Lectures on Advanced 
Mathematical Methods 
for Physicists 


Sunil Mukhi 


Tata Institute of Fundamental Research, India 


N Mukunda 


formerly of Indian Institute of Science, India 


Hindustan Book Agency 


World Scientific 


NEW JERSEY · LONDON · SINGAPORE · BEIJING · SHANGHAI · HONG KONG · TAIPEI · CHENNAI 


Published by 

World Scientific Publishing Co. Pte. Ltd. 

5 Toh Tuck Link, Singapore 596224 

USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601 
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE 


British Library Cataloguing-in-Publication Data 
A catalogue record for this book is available from the British Library. 


LECTURES ON ADVANCED MATHEMATICAL METHODS FOR PHYSICISTS 
Copyright © 2010 Hindustan Book Agency (HBA) 


Authorized edition by World Scientific Publishing Co. Pte. Ltd. for exclusive distribution worldwide 
except India. 


The distribution rights for print copies of the book for India remain with Hindustan Book Agency (HBA). 


All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, 
electronic or mechanical, including photocopying, recording or any information storage and retrieval 
system now known or to be invented, without written permission from the Publisher. 


ISBN-13 978-981-4299-73-2 
ISBN-10 981-4299-73-1 


Printed in India, bookbinding made in Singapore. 


Contents 


Part I: Topology and Differential Geometry 

Introduction to Part I 

1 Topology 
1.1 Preliminaries 
1.2 Topological Spaces 
1.3 Metric spaces 
1.4 Basis for a topology 
1.5 Closure 
1.6 Connected and Compact Spaces 
1.7 Continuous Functions 
1.8 Homeomorphisms 
1.9 Separability 

2 Homotopy 
2.1 Loops and Homotopies 
2.2 The Fundamental Group 
2.3 Homotopy Type and Contractibility 
2.4 Higher Homotopy Groups 

3 Differentiable Manifolds I 
3.1 The Definition of a Manifold 
3.2 Differentiation of Functions 
3.3 Orientability 
3.4 Calculus on Manifolds: Vector and Tensor Fields 
3.5 Calculus on Manifolds: Differential Forms 
3.6 Properties of Differential Forms 
3.7 More About Vectors and Forms 

4 Differentiable Manifolds II 
4.1 Riemannian Geometry 
4.2 Frames 
4.3 Connections, Curvature and Torsion 
4.4 The Volume Form 
4.5 Isometry 
4.6 Integration of Differential Forms 
4.7 Stokes' Theorem 
4.8 The Laplacian on Forms 

5 Homology and Cohomology 
5.1 Simplicial Homology 
5.2 De Rham Cohomology 
5.3 Harmonic Forms and de Rham Cohomology 

6 Fibre Bundles 
6.1 The Concept of a Fibre Bundle 
6.2 Tangent and Cotangent Bundles 
6.3 Vector Bundles and Principal Bundles 

Bibliography for Part I 


Part II: Group Theory and Structure and Representations of Compact Simple Lie Groups and Algebras 

Introduction to Part II 

7 Review of Groups and Related Structures 
7.1 Definition of a Group 
7.2 Conjugate Elements, Equivalence Classes 
7.3 Subgroups and Cosets 
7.4 Invariant (Normal) Subgroups, the Factor Group 
7.5 Abelian Groups, Commutator Subgroup 
7.6 Solvable, Nilpotent, Semisimple and Simple Groups 
7.7 Relationships Among Groups 
7.8 Ways to Combine Groups - Direct and Semidirect Products 
7.9 Topological Groups, Lie Groups, Compact Lie Groups 

8 Review of Group Representations 
8.1 Definition of a Representation 
8.2 Invariant Subspaces, Reducibility, Decomposability 
8.3 Equivalence of Representations, Schur's Lemma 
8.4 Unitary and Orthogonal Representations 
8.5 Contragredient, Adjoint and Complex Conjugate Representations 
8.6 Direct Products of Group Representations 

9 Lie Groups and Lie Algebras 
9.1 Local Coordinates in a Lie Group 
9.2 Analysis of Associativity 
9.3 One-parameter Subgroups and Canonical Coordinates 
9.4 Integrability Conditions and Structure Constants 
9.5 Definition of a (real) Lie Algebra: Lie Algebra of a given Lie Group 
9.6 Local Reconstruction of Lie Group from Lie Algebra 
9.7 Comments on the G - G Relationship 
9.8 Various Kinds of and Operations with Lie Algebras 

10 Linear Representations of Lie Algebras 

11 Complexification and Classification of Lie Algebras 
11.1 Complexification of a Real Lie Algebra 
11.2 Solvability, Levi's Theorem, and Cartan's Analysis of Complex (Semi) Simple Lie Algebras 
11.3 The Real Compact Simple Lie Algebras 

12 Geometry of Roots for Compact Simple Lie Algebras 

13 Positive Roots, Simple Roots, Dynkin Diagrams 
13.1 Positive Roots 
13.2 Simple Roots and their Properties 
13.3 Dynkin Diagrams 

14 Lie Algebras and Dynkin Diagrams for $SO(2l)$, $SO(2l+1)$, $USp(2l)$, $SU(l+1)$ 
14.1 The $SO(2l)$ Family - $D_l$ of Cartan 
14.2 The $SO(2l+1)$ Family - $B_l$ of Cartan 
14.3 The $USp(2l)$ Family - $C_l$ of Cartan 
14.4 The $SU(l+1)$ Family - $A_l$ of Cartan 
14.5 Coincidences for low Dimensions and Connectedness 

15 Complete Classification of All CSLA Simple Root Systems 
15.1 Series of Lemmas 
15.2 The allowed Graphs 
15.3 The Exceptional Groups 

16 Representations of Compact Simple Lie Algebras 
16.1 Weights and Multiplicities 
16.2 Actions of $E_\alpha$ and $SU(2)^{(\alpha)}$ - the Weyl Group 
16.3 Dominant Weights, Highest Weight of a UIR 
16.4 Fundamental UIR's, Survey of all UIR's 
16.5 Fundamental UIR's for $A_l$, $B_l$, $C_l$, $D_l$ 
16.6 The Elementary UIR's 
16.7 Structure of States within a UIR 

17 Spinor Representations for Real Orthogonal Groups 
17.1 The Dirac Algebra in Even Dimensions 
17.2 Generators, Weights and Reducibility of $U(S)$ - the spinor UIR's of $D_l$ 
17.3 Conjugation Properties of Spinor UIR's of $D_l$ 
17.4 Remarks on Antisymmetric Tensors Under $D_l = SO(2l)$ 
17.5 The Spinor UIR's of $B_l = SO(2l+1)$ 
17.6 Antisymmetric Tensors under $B_l = SO(2l+1)$ 

18 Spinor Representations for Real Pseudo Orthogonal Groups 
18.1 Definition of $SO(q,p)$ and Notational Matters 
18.2 Spinor Representations $S(\Lambda)$ of $SO(p,q)$ for $p+q=2l$ 
18.3 Representations Related to $S(\Lambda)$ 
18.4 Behaviour of the Irreducible Spinor Representations $S_\pm(\Lambda)$ 
18.5 Spinor Representations of $SO(p,q)$ for $p+q=2l+1$ 
18.6 Dirac, Weyl and Majorana Spinors for $SO(p,q)$ 

Bibliography for Part II 

Index 


Part I: Topology and 
Differential Geometry 


Sunil Mukhi 


Department of Theoretical Physics 
Tata Institute of Fundamental Research 
Mumbai 400 005, India 


J.C. Bose Fellow. 


Introduction to Part I 


These notes describe the basic notions of topology and differential geometry in 
a style that is hopefully accessible to students of physics. While in mathematics 
the proof of a theorem is central to its discussion, physicists tend to think of 
mathematical formalism mainly as a tool and are often willing to take a theorem 
on faith. While due care must be taken to ensure that a given theorem is actually 
a theorem (i.e. that it has been proved), and also that it is applicable to the 
physics problem at hand, it is not necessary that physics students be able to 
construct or reproduce the proofs on their own. Of course, proofs not only 
provide rigour but also frequently serve to highlight the spirit of the result. I 
have tried here to compensate for this loss by describing the motivation and 
spirit of each theorem in simple language, and by providing some examples. 

The examples provided in these notes are not usually taken from physics, 
however. I have deliberately tried to avoid the tone of other mathematics-for- 
physicists textbooks where several physics problems are set up specifically to 
illustrate each mathematical result. Instead, the attempt has been to highlight 
the beauty and logical flow of the mathematics itself, starting with an abstract 
set of points and adding “qualities” like a topology, a differentiable structure, a 
metric and so on until one has reached all the way to fibre bundles. 

Physical applications of the topics discussed here are to be found primarily 
in the areas of general relativity and string theory. It is my hope that the 
enterprising student interested in researching these fields will be able to use 
these notes to penetrate the dense physical and mathematical formalism that 
envelops (and occasionally conceals!) those profound subjects. 


Chapter 1 


Topology 


1.1 Preliminaries 


Topology is basically the study of continuity. A topological space is an abstract 
structure defined on a set, in such a way that the concept of continuous maps 
makes sense. In this chapter we will first study this abstract structure and then 
go on to see, in some examples, why it is a sensible approach to continuous 
functions. 

The physicist is familiar with ideas of continuity in the context of real 
analysis, so here we will use the real line as a model of a topological space. 
For a mathematician this is only one example, and by no means a typical one, 
but for a physicist the real line and its direct products are sufficient to cover 
essentially all cases of interest. 

In subsequent chapters, we will introduce additional structures on a topo- 
logical space. This will eventually lead us to manifolds and fibre bundles, the 
main “stuff” of physics. 

The following terms are in common use and it will be assumed that the 
reader is familiar with them: set, subset, empty set, element, union, intersection, 
integers, rational numbers, real numbers. The relevant symbols are illustrated 
below: 


subset: $\subset$    empty set: $\emptyset$    element: $\in$ 
union: $\cup$    intersection: $\cap$ 
set of integers: $\mathbb{Z}$    set of rational numbers: $\mathbb{Q}$    set of real numbers: $\mathbb{R}$ 


We need some additional terminology from basic set theory that may also 
be known to the reader, but we will explain it nevertheless. 


Definition: If $A \subset B$, the complement of $A$ in $B$, called $A'$, is 


$A' = \{ x \in B \mid x \notin A \}$ 




For the reader encountering such condensed mathematical statements for 
the very first time, let us express the above statement in words: $A'$ is the set 
of all elements $x$ that are in $B$ such that $x$ is not in $A$. The reader is 
encouraged to similarly verbalise any mathematical statement that is not 
immediately comprehensible. 

We continue with our terminology. 


Definition: The Cartesian product of two sets $A$ and $B$ is the set 
$A \times B = \{ (a,b) \mid a \in A,\ b \in B \}$ 


Thus it is a collection of ordered pairs consisting of one element from each of 
the original sets. 


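Example: The Cartesian product $\mathbb{R} \times \mathbb{R}$ is called $\mathbb{R}^2$, the Euclidean plane. One 
can iterate Cartesian products any number of times, for example 
$\mathbb{R} \times \mathbb{R} \times \cdots \times \mathbb{R}$ ($n$ factors) is the $n$-dimensional Euclidean space $\mathbb{R}^n$. 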
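For finite sets the Cartesian product can be listed explicitly. The following is a minimal sketch in Python; the sets `A` and `B` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Two small illustrative sets (hypothetical, chosen only for this sketch).
A = {1, 2, 3}
B = {"x", "y"}

# The Cartesian product A x B: all ordered pairs (a, b) with a in A, b in B.
AxB = set(product(A, B))

# Iterating the product three times gives ordered triples,
# a finite analogue of passing from R to R^3.
A_cubed = set(product(A, repeat=3))

print(len(AxB))      # 6, i.e. |A| * |B|
print(len(A_cubed))  # 27, i.e. |A| ** 3
```

Note that the pairs are ordered: `(1, "x")` belongs to the product, while `("x", 1)` does not.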


We continue by defining functions and their basic properties. 


Definition: A function $f : A \to B$ is a rule which, for each $a \in A$, assigns a unique 
$b \in B$, called the image of $a$. We write $b = f(a)$. 


A function $f : A \to B$ is surjective (onto) if every $b \in B$ is the image of at 
least one $a \in A$. 


A function $f : A \to B$ is injective (one-to-one) if every $b \in B$ is the image 
of at most one $a \in A$. 


A function can be surjective, or injective, or neither, or both. If it is both, 
it is called bijective. In this case, every $b \in B$ is the image of exactly one $a \in A$. 
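On finite sets these properties can be checked mechanically. The sketch below assumes a function is represented as a Python dictionary from elements of `A` to elements of `B`; the helper names and the sample functions `f` and `g` are illustrative, not part of the text.

```python
def is_function(f, A, B):
    """f assigns to every a in A exactly one image, and that image lies in B."""
    return set(f.keys()) == set(A) and all(f[a] in B for a in A)

def is_surjective(f, A, B):
    """Every b in B is the image of at least one a in A."""
    return {f[a] for a in A} == set(B)

def is_injective(f, A, B):
    """Every b in B is the image of at most one a in A."""
    images = [f[a] for a in A]
    return len(images) == len(set(images))

def is_bijective(f, A, B):
    return is_injective(f, A, B) and is_surjective(f, A, B)

A = {1, 2, 3}
B = {"p", "q", "r"}
f = {1: "p", 2: "q", 3: "r"}   # injective and surjective, hence bijective
g = {1: "p", 2: "p", 3: "q"}   # neither injective nor surjective
```

Here `g` fails to be injective because "p" is the image of two elements, and fails to be surjective because "r" is the image of none.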

Armed with these basic notions, we can now introduce the concept of a 
topological space. 


1.2 Topological Spaces 


Consider the real line $\mathbb{R}$. We are familiar with two important kinds of subsets 
of $\mathbb{R}$: 


Open interval: $(a,b) = \{ x \in \mathbb{R} \mid a < x < b \}$ 
Closed interval: $[a,b] = \{ x \in \mathbb{R} \mid a \le x \le b \}$ (1.1) 


A closed interval contains its end-points, while an open interval does not. Let 
us generalize this idea slightly. 


Definition: $X \subset \mathbb{R}$ is an open set in $\mathbb{R}$ if: 


$x \in X \Rightarrow x \in (a,b) \subset X$ for some $(a,b)$ 




In words, a subset $X$ of $\mathbb{R}$ will be called open if every point inside it can be 
enclosed by an open interval $(a,b)$ lying entirely inside it. 


Examples: 

(i) Every open interval $(a,b)$ is an open set, since any point inside such an 
interval can be enclosed in a smaller open interval contained within $(a,b)$. 

(ii) $X = \{ x \in \mathbb{R} \mid x > 0 \}$ is an open set. This is not, however, an open interval. 

(iii) $\mathbb{R}$ itself is an open set. 

(iv) The empty set $\emptyset$ is an open set. This may seem tricky at first, but one 
merely has to check the definition, which goes “every point inside it can be ...”. 
Such a sentence is always true if there are no points in the set! 

(v) A closed interval is not an open set. Take $X = [a,b]$. Then $a \in X$ and $b \in X$ 
cannot be enclosed by any open interval lying entirely in $[a,b]$. All the remaining 
points in the closed interval can actually be enclosed in an open interval inside 
$[a,b]$, but the desired property does not hold for all points, and therefore the 
set fails to be open. 

(vi) A single point is not an open set. To see this, check the definition. 


Next, we define a closed set in $\mathbb{R}$. 

Definition: A closed set in $\mathbb{R}$ is the complement in $\mathbb{R}$ of any open set. 


Examples: 

(i) A closed interval $[a,b]$ is a closed set. It is the complement of the open set 
$X = \{ x \in \mathbb{R} \mid x > b \} \cup \{ x \in \mathbb{R} \mid x < a \}$. 

(ii) A single point $\{a\}$ is a closed set. It is the complement in $\mathbb{R}$ of the open set 
$X = \{ x \in \mathbb{R} \mid x > a \} \cup \{ x \in \mathbb{R} \mid x < a \}$. 

(iii) The full set $\mathbb{R}$ and the empty set $\emptyset$ are both closed sets, since they are both 
open sets and are the complements of each other. 


We see that a set can be open, or closed, or neither, or both. For example, 
$[a,b)$ is neither closed nor open (this is the set that contains the end-point $a$ but 
not the end-point $b$). Since $a$ cannot be enclosed by an open interval lying inside 
$[a,b)$, the set $[a,b)$ is not open. In its complement, $b$ cannot be enclosed by any 
open interval lying inside the complement. So $[a,b)$ is not closed either. In $\mathbb{R}$, 
one can check that the only sets which are both open and closed according to our 
definition are $\mathbb{R}$ and $\emptyset$. 

It is important to emphasize that so far, we are only talking about open 
and closed sets in the real line R. The idea is to extract some key properties 
of these sets and use those to define open sets and closed sets in an abstract 
setting. Continuing in this direction, we note some interesting properties of 
open and closed sets in R: 

(a) The union of any number of open sets in $\mathbb{R}$ is open. This follows from the 
definition, and should be checked carefully. 
(b) The intersection of a finite number of open sets in $\mathbb{R}$ is open. 

Why did we have to specify a finite number? The answer is that by taking 
the intersection of an infinite number of open sets in $\mathbb{R}$, we can actually 
manufacture a set which is not open. As an example, consider the following infinite 
collection of open intervals: 


$\left\{ \left( -\tfrac{1}{n}, \tfrac{1}{n} \right) \mid n = 1, 2, 3, \ldots \right\}$ (1.2) 


As $n$ becomes very large, the sets keep becoming smaller. But the point 0 is 
contained in each set, whatever the value of $n$. Hence it is also contained in 
their intersection. However any other chosen point $a \in \mathbb{R}$, $a \neq 0$, will lie outside the 
set $\left( -\tfrac{1}{n}, \tfrac{1}{n} \right)$ once $n$ becomes larger than $\tfrac{1}{|a|}$. Thus the infinite intersection over 
all $n$ contains only the single point 0, which as we have seen is not an open set. 
This shows that only finite intersections of open sets are guaranteed to be open. 
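
The argument above can be illustrated numerically. The following sketch uses exact rational arithmetic; the function names are hypothetical. For a given non-zero point it finds the first interval of the collection that excludes it, while 0 is never excluded.

```python
from fractions import Fraction
from math import ceil

def in_interval(a, n):
    """True if a lies in the open interval (-1/n, 1/n)."""
    return -Fraction(1, n) < a < Fraction(1, n)

def first_excluding_n(a):
    """Smallest positive integer n with a not in (-1/n, 1/n); requires a != 0."""
    if a == 0:
        raise ValueError("0 lies in every interval (-1/n, 1/n)")
    # a is excluded as soon as 1/n <= |a|, that is, as soon as n >= 1/|a|.
    return max(1, ceil(1 / abs(Fraction(a))))

a = Fraction(1, 10)
n = first_excluding_n(a)
print(n)  # 10
```

For `a = 1/10` the point still lies in the interval with `n = 9` but is excluded from `n = 10` onwards, so it cannot belong to the intersection.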

Having discussed open and closed sets on $\mathbb{R}$, we are now in a position to 
extend this to a more abstract setting, which is that of a topological space. This 
will allow us to formulate the concepts of continuity, limit points, compactness, 
connectedness etc. in a context far more general than real analysis. 


Definition: A topological space is a set $S$ together with a collection $U$ of subsets 
of $S$, called open sets, satisfying the following conditions: 

(i) $\emptyset \in U$, $S \in U$. 

(ii) The union of arbitrarily many $u_i \in U$ is again in $U$ (thus the union of any 
number of open sets is open). 

(iii) The intersection of finitely many subsets $u_i \in U$ is again in $U$ (thus the 
intersection of finitely many open sets is open). 

The pair $(S,U)$ is called a topological space. The collection $U$ of subsets of $S$ is 
called a topology on $S$. 


The reader will notice that if the set $S$ happens to coincide with the real 
line $\mathbb{R}$, then the familiar collection of open sets on $\mathbb{R}$ satisfies the axioms above, 
and so $\mathbb{R}$ together with its usual collection of open sets provides an example of 
a topological space. But as we will soon see, we can put different topologies on 
the same set of points, by choosing a different collection of subsets as open sets. 
This will lead to more general (and strange!) examples of topological spaces. 

Having defined open sets, it is natural to define closed sets as follows: 


Definition: In a given topological space, a closed set is a subset of $S$ which is 
the complement of an open set. Thus, if $u \in U$, then $u' = \{ x \in S \mid x \notin u \}$ is 
closed. 


Examples: 

(i) Let $S$ be any set whatsoever. Choose $U$ to be the collection of two sets 
$\{ \emptyset, S \}$. This is the smallest collection we can take consistent with the 
axioms! Clearly all the axioms are trivially satisfied. This is called the trivial or 
sometimes indiscrete topology on $S$. 

(ii) Let $S$ again be any set. Choose $U$ to be the collection of all subsets of $S$. 
This is clearly the largest collection we can possibly take. In this topology, all 
subsets are open. But they are all closed as well (check this!). This is another 
trivial example of a topology, called the discrete topology, on $S$. We see, from 
this example and the one above, that it is possible to define more than one 
topology on the same set. We can also guess that if the topology has too few 
or too many open sets, it is liable to be trivial and quite uninteresting. 

(iii) Let $S$ be the real line $\mathbb{R}$. $U$ is the collection of all subsets $X$ of $\mathbb{R}$ such 
that 

$x \in X \Rightarrow x \in (a,b) \subset X$. (1.3) 


This is our old definition of “open set in $\mathbb{R}$”. We realise now that it was not the 
unique choice of topology, but it was certainly the most natural and familiar. 
Accordingly, this topology is called the usual topology on $\mathbb{R}$. 

(iv) $S$ is a finite set, consisting of, say, six elements. We write 


$S = \{a, b, c, d, e, f\}$ (1.4) 


Choose 
$U = \{ \emptyset, S, \{a,b\}, \{b\}, \{b,c\}, \{a,b,c\} \}$ (1.5) 


This defines a topology. Some of the closed sets (besides $\emptyset$ and $S$) are $\{d,e,f\}$ 
and $\{a,c,d,e,f\}$. This example shows that we can very well define a topology 
on a finite set. 
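
For a finite set the axioms can be verified by brute force. The sketch below is one possible check, not a method taken from the text; the helper name `is_topology` is hypothetical. It is applied to the six-element example with the open sets $\emptyset$, $S$, $\{a,b\}$, $\{b\}$, $\{b,c\}$, $\{a,b,c\}$.

```python
from itertools import combinations

def is_topology(S, U):
    """Check the three axioms for a collection U of subsets of a finite set S."""
    S = frozenset(S)
    U = {frozenset(u) for u in U}
    # Axiom (i): the empty set and S itself must be open.
    if frozenset() not in U or S not in U:
        return False
    # Every member of U must be a subset of S.
    if any(not u <= S for u in U):
        return False
    # For a finite collection, closure under pairwise unions and pairwise
    # intersections implies axioms (ii) and (iii) by induction.
    return all((u | v) in U and (u & v) in U for u, v in combinations(U, 2))

S = set("abcdef")
U = [set(), S, {"a", "b"}, {"b"}, {"b", "c"}, {"a", "b", "c"}]

print(is_topology(S, U))  # True

# The closed sets are the complements of the open sets.
closed_sets = {frozenset(S) - frozenset(u) for u in U}
```

The set `closed_sets` contains $\{d,e,f\}$ and $\{a,c,d,e,f\}$, in agreement with the closed sets quoted in the example.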


Exercise: If we leave out {a,b} in U, do we still get a topology? What if we 
leave out {b}? What if we add {d}? 


1.3 Metric spaces 


A topological space carries no intrinsic notion of metric, or distance between 
points. We may choose to define this notion on a given topological space if 
we like. We will find that among other things, introducing a metric helps to 
generate many more examples of topological spaces. 


Definition: A metric space is a set $S$ along with a map which assigns a real 
number $\ge 0$ to each pair of points in $S$, satisfying the conditions below. If 
$x \in S$, $y \in S$ then $d(x,y)$ should be thought of as the distance between $x$ and 
$y$. The conditions on this map are: 

(i) $d(x,y) = 0$ if and only if $x = y$. 

(ii) $d(x,y) = d(y,x)$. 

(iii) $d(x,z) \le d(x,y) + d(y,z)$ (triangle inequality). 

The map $d : S \times S \to \mathbb{R}^+$ is called a metric on $S$. 


We see from the list of axioms above, that we are generalising to abstract 
topological spaces the familiar notion of distance on Euclidean space. Later 
on we will see that Euclidean space is really a “differentiable manifold with a 
metric”, which involves a lot more structure than we have encountered so far. 
It is useful to keep in mind that a metric can make sense without any of that 
other structure: all we need is a set, not even a topological space. In fact we 
will see in a moment that defining a metric on a set actually helps us define a 
topological space. 


Examples: On $\mathbb{R}$ we may choose the usual metric $d(x,y) = |x - y|$. On $\mathbb{R}^2$ we 
can similarly choose $d(\vec{x}, \vec{y}) = |\vec{x} - \vec{y}|$. The reader should check that the axioms 
are satisfied. 
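
A quick numerical sanity check of the axioms is sketched below. It tests a candidate distance function on a finite sample of points only, so a pass is a necessary condition rather than a proof; the function name and the sample are hypothetical.

```python
from itertools import product

def satisfies_metric_axioms(d, points):
    """Check the metric axioms for d on every pair and triple from a finite sample."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:
            return False                      # distances must be non-negative
        if (d(x, y) == 0) != (x == y):
            return False                      # axiom (i)
        if d(x, y) != d(y, x):
            return False                      # axiom (ii), symmetry
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z):
            return False                      # axiom (iii), triangle inequality
    return True

def usual(x, y):
    """The usual metric on the real line."""
    return abs(x - y)

sample = [-3, -1, 0, 2, 5]   # integers, to avoid floating-point rounding

print(satisfies_metric_axioms(usual, sample))  # True
```

A candidate such as `lambda x, y: x - y` is rejected immediately, since it takes negative values and is not symmetric.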


Exercise: Does $d(x,y) = (x - y)^2$ define a metric on $\mathbb{R}$? 


Given any metric on a set S, we can define a particular topology on S, 
called the metric topology. 


Definition: With respect to a given metric on a set $S$, the open disc on $S$ at 
the point $x \in S$ with radius $a > 0$ is the set 


$D_x(a) = \{ y \in S \mid d(x,y) < a \}$ 


The open disc is just the set of points strictly within a distance a of the 
chosen point. However, we must remember that this distance is defined with 
respect to an abstract metric that may not necessarily be the one familiar to 
us. 

Having defined open discs via the metric, we can now define a topology 
(called the metric topology) on $S$ as follows. A subset $X \subset S$ will be called an 
open set if every $x \in X$ can be enclosed by some open disc contained entirely 
in $X$: 

$X$ is open if $x \in X \Rightarrow x \in D_x(a) \subset X$ (1.6) 


for some $a$. This gives us our collection of open sets $X$, and together with the 
set $S$, we claim this defines a topological space. 


Exercise: Check that this satisfies the axioms for a topological space. 


We realise that the usual topology on $\mathbb{R}$ is just the metric topology with 
$d(x,y) = |x - y|$. The metric topology also gives the usual topology on 
$n$-dimensional Euclidean space $\mathbb{R}^n$. The metric there is: 


$d(\vec{x}, \vec{y}) = \sqrt{\sum_{i=1}^{n} (x^i - y^i)^2}$ (1.7) 


where $x^i$, $y^i$, $i = 1, 2, \cdots, n$ are the familiar coordinates of points in $\mathbb{R}^n$. 

The open discs are familiar too. For $\mathbb{R}^2$ they are the familiar discs on the 
plane (not including their boundary), while for $\mathbb{R}^3$ they are the interior of a 
solid ball about the point. In higher dimensions they are generalisations of this, 
so we will use the term “open ball” to describe open discs in any $\mathbb{R}^n$. 
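
As a small illustration, the sketch below computes the usual Euclidean distance between points of $\mathbb{R}^n$, represented as tuples of coordinates, and tests membership in an open ball. The function names are hypothetical.

```python
from math import sqrt

def euclidean(x, y):
    """Usual Euclidean distance between two points given as coordinate tuples."""
    return sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def in_open_ball(y, centre, radius):
    """True if y lies strictly within the given distance of the centre."""
    return euclidean(centre, y) < radius

print(euclidean((0, 0), (3, 4)))            # 5.0
print(in_open_ball((1, 0), (0, 0), 1))      # False: boundary points are excluded
```

The strict inequality is what excludes the boundary: the point $(1,0)$ lies on the unit circle and so is not in the open unit disc about the origin.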

Note that for a given set of points we can define many inequivalent metrics. 
On $\mathbb{R}$, for example, we could use the rather unorthodox (for physicists) 
definition: 


$d(x,y) = \begin{cases} 1, & x \neq y \\ 0, & x = y. \end{cases}$ 


Thus every pair of distinct points is the same (unit) distance apart. 


Exercise: Check that this satisfies the axioms for a metric. What do the open 
discs look like in this metric? Show that the associated metric topology is one 
that we have already introduced above. 


1.4 Basis for a topology 


Defining a topological space requires listing all the open sets in the collection 
U. This can be quite tedious. If there are infinitely many open sets it might 
be impossible to list all of them. Therefore, for convenience we introduce the 
concept of a basis. 

For this we return to the familiar case: the usual topology on $\mathbb{R}$. Here, the 
open intervals $(a,b)$ form a distinguished subclass of the open sets. But they 
are not all the open sets (for example the union of several disjoint open intervals 
is an open set, but is not itself an open interval. If this point was not clear then it is 
time to go back and carefully review the definition of open sets on $\mathbb{R}$!). 


The open intervals on $\mathbb{R}$ have the following properties: 

(i) The union of all open intervals is the whole set $\mathbb{R}$: 

$\bigcup_i (a_i, b_i) = \mathbb{R}$ (1.8) 


(ii) The intersection of two open intervals can be expressed as the union of other 
open intervals. For example, if $a_1 < a_2 < b_1 < b_2$ then 


$(a_1, b_1) \cap (a_2, b_2) = (a_2, b_1)$ (1.9) 


(iii) $\emptyset$ is an open interval: $(a,a) = \emptyset$. 


These properties can be abstracted to define a basis for an arbitrary topological 
space. 


Definition: In an arbitrary topological space $(S,U)$, any collection of open sets 
(a subset of the full collection $U$) satisfying the above three conditions is called 
a basis for the topology. 


A basis contains only a preferred collection of open sets that “generates” 
(via arbitrary unions and finite intersections) the complete collection U. And 
there can be many different bases for the same topology. 
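
On a finite set one can see a basis generate its topology explicitly. The sketch below assumes a collection satisfying the three conditions above, in which case taking all unions of basis elements already yields every open set. The function name is hypothetical, and the six-element set with open sets $\emptyset$, $S$, $\{a,b\}$, $\{b\}$, $\{b,c\}$, $\{a,b,c\}$ is reused as the target topology.

```python
from itertools import combinations

def topology_from_basis(basis):
    """All unions of basis elements (the empty union gives the empty set)."""
    basis = [frozenset(b) for b in basis]
    opens = {frozenset()}
    for r in range(1, len(basis) + 1):
        for combo in combinations(basis, r):
            opens.add(frozenset().union(*combo))
    return opens

S = set("abcdef")
# A basis for the finite topology: note that {a, b, c} is deliberately left out.
basis = [set(), {"b"}, {"a", "b"}, {"b", "c"}, S]

generated = topology_from_basis(basis)
```

The open set $\{a,b,c\}$ is not in the basis but is recovered as the union $\{a,b\} \cup \{b,c\}$, so `generated` contains all six open sets.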


Example: In any metric space, the open discs provide a basis for the metric 
topology. The reader should check this statement. 


Exercise: In the usual topology on $\mathbb{R}^2$, give some examples of open sets that 
are not open discs. Also find a basis for the same topology consisting of open 
sets that are different from open discs. 


1.5 Closure 


In many familiar cases, we have seen that the distinction between open and 
closed sets is that the former do not contain their “end points” while the latter 
do. In trying to make this more precise, we are led to the concept of closure of 
a set. 

In $\mathbb{R}^2$, for example, the set $X_1 = \{ \vec{x} \in \mathbb{R}^2 \mid \vec{x}^2 < 1 \}$ defines an open 
set, the open unit disc. It is open because every point in it can be enclosed in 
a small open disc which lies inside the bigger one (see Fig. 1.1). Points on the 
unit circle $\vec{x}^2 = 1$ are not in the open unit disc, so we don’t have to worry about 
enclosing them. 


Figure 1.1: The open unit disc. Every point in it can be enclosed in an open 
disc. 


On the other hand consider the set $X_2 = \{ \vec{x} \in \mathbb{R}^2 \mid \vec{x}^2 \le 1 \}$. In addition 
to points within the unit circle, this set contains all points on the unit circle. 
But clearly $X_2$ is not an open set, since points on the boundary circle cannot 
be enclosed by open discs in $X_2$. In fact $X_2$ is a closed set, as one can check by 
going back to the axioms. 

So it seems that we can add some points to an open set and make it a 
closed set. Let us make this precise via some definitions. 


Definition: Let $(S,U)$ be a topological space. Let $A \subset S$. A point $s \in S$ is 
called a limit point of $A$ if, whenever $s$ is contained in an open set $u \in U$, we 
have 


$(u - \{s\}) \cap A \neq \emptyset$ 


In other words, $s$ is a limit point of $A$ if every open neighbourhood of $s$ contains 
at least one point of $A$ other than $s$ itself. 


Exercise: Show that all points on the boundary of an open disc are limit points 
of the open disc. They do not, however, belong to the open disc. This shows 
that in general, a limit point of a set A need not be contained in A. 


Definition: The closure of a set $A \subset S$ is: 
$\bar{A} = A \cup \{ \text{limit points of } A \}$ 


In other words, if we add to a set all its limit points, we get the closure of the 
set. This is so named because of the following result: 


Theorem: The closure of any set $A \subset S$ is a closed set. 


Exercise: Prove this theorem. As always, it helps to go back to the definition. 
What you have to show is that the complement of the closure $\bar{A}$ is an open set. 
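
In a finite topological space, limit points and closures can be computed directly from the definitions. The sketch below does this for the six-element set with open sets $\emptyset$, $S$, $\{a,b\}$, $\{b\}$, $\{b,c\}$, $\{a,b,c\}$; the function names are hypothetical. It only checks the statement of the theorem in one example and is not a substitute for the proof.

```python
def limit_points(A, S, U):
    """Points s of S such that every open set containing s meets A away from s."""
    A = frozenset(A)
    return {
        s for s in S
        if all((frozenset(u) - {s}) & A for u in U if s in u)
    }

def closure(A, S, U):
    """The set A together with all of its limit points."""
    return frozenset(A) | limit_points(A, S, U)

S = set("abcdef")
U = [set(), S, {"a", "b"}, {"b"}, {"b", "c"}, {"a", "b", "c"}]

closure_of_d = closure({"d"}, S, U)
closure_of_b = closure({"b"}, S, U)
```

Here the closure of $\{d\}$ is $\{d,e,f\}$, whose complement $\{a,b,c\}$ is open, so the closure is indeed closed. The closure of $\{b\}$ turns out to be the whole set $S$, because every non-empty open set contains $b$.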


Given a topological space $(S,U)$, we can define a topology on any subset 
$A \subset S$. Simply choose $U_A$ to be the collection of sets $u_i \cap A$, $u_i \in U$. This 
topology is called the relative topology on $A$. Note that sets which are open in 
$A \subset S$ in the relative topology need not themselves be open in $S$. 


Exercise: Consider subsets of $\mathbb{R}$ and find an example to illustrate this point. 
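
The relative topology is easy to compute on a finite set. The sketch below uses the six-element set with open sets $\emptyset$, $S$, $\{a,b\}$, $\{b\}$, $\{b,c\}$, $\{a,b,c\}$ rather than subsets of the real line, so it does not answer the exercise; the function name and the choice of subset are hypothetical.

```python
def relative_topology(A, U):
    """Intersect every open set of the ambient topology U with the subset A."""
    A = frozenset(A)
    return {frozenset(u) & A for u in U}

S = set("abcdef")
U = [set(), S, {"a", "b"}, {"b"}, {"b", "c"}, {"a", "b", "c"}]

A = {"c", "d"}
U_A = relative_topology(A, U)
```

The result consists of $\emptyset$, $\{c\}$ and $\{c,d\}$. The set $\{c\}$ is open in the relative topology on $A$ even though it is not an open set of $S$.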


1.6 Connected and Compact Spaces 


Consider the real line $\mathbb{R}$ with the usual topology. If we delete one point, say 
$\{0\}$, then $\mathbb{R} - \{0\}$ falls into two disjoint pieces. It is natural to say that $\mathbb{R}$ is 
connected but $\mathbb{R} - \{0\}$ is disconnected. To make this precise and general, we 
need to give a definition in terms of a given topology on a set. 


Definition: A topological space $(S,U)$ is connected if it cannot be expressed 
as the union of two disjoint non-empty open sets in its topology. If on the other 
hand we can express it as such a union, in other words if we can find non-empty 
open sets $u_1, u_2 \in U$ such that $u_1 \cap u_2 = \emptyset$, $u_1 \cup u_2 = S$, then the space is said 
to be disconnected. 


Theorem: In a connected topological space $(S,U)$, the only sets which are 
both closed and open are $S$ and $\emptyset$. 


Exercise: Prove this theorem. 
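
For a finite topological space, connectedness can be tested by searching for a set that is both open and closed, other than the empty set and the whole space. The sketch below is one such test under that reading; the function name is hypothetical and the six-element set with open sets $\emptyset$, $S$, $\{a,b\}$, $\{b\}$, $\{b,c\}$, $\{a,b,c\}$ serves as the example.

```python
def is_connected(S, U):
    """True if no open set other than the empty set and S has an open complement."""
    S = frozenset(S)
    opens = {frozenset(u) for u in U}
    # A non-empty proper open set u with open complement splits S into
    # two disjoint non-empty open pieces, u and S - u.
    return not any(u and u != S and (S - u) in opens for u in opens)

S = set("abcdef")
U = [set(), S, {"a", "b"}, {"b"}, {"b", "c"}, {"a", "b", "c"}]

print(is_connected(S, U))                              # True
print(is_connected({0, 1}, [set(), {0}, {1}, {0, 1}])) # False: discrete topology
```

The second call shows a two-point set with the discrete topology, which splits into the two open sets $\{0\}$ and $\{1\}$.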


This definition tells us in particular that with the usual topology, $\mathbb{R}^n$ is 
connected for all $n$. $\mathbb{R} - \{0\}$ is disconnected, but $\mathbb{R}^n - \{0\}$ is connected for 
$n \ge 2$. On the other hand, the space $\mathbb{R}^2 - \{ (x,0) \mid x \in \mathbb{R} \}$, which is $\mathbb{R}^2$ 
with a line removed, is disconnected. Similarly, $\mathbb{R}^2 - \{ (x,y) \mid x^2 + y^2 = 1 \}$, 
which is $\mathbb{R}^2$ with the unit circle removed, is disconnected. $\mathbb{R}^3$ minus a plane is 
disconnected, and so on. In these examples it is sufficient to rely on our intuition 
about connectedness, but in more abstract cases one needs to carefully follow 
the definition above. 

Note that connectivity depends in an essential way on not just the set, but 
also the collection of open sets $U$, in other words on the topology. For example, 
$\mathbb{R}$ with the discrete topology is disconnected: 


$\mathbb{R} = \bigcup_a \{a\}, \quad a \in \mathbb{R}$ (1.10) 


Recall that each $\{a\}$ is an open set, disjoint from every other, in the discrete 
topology. We can also conclude that $\mathbb{R}$ is disconnected in the discrete topology 
from a theorem stated earlier. In this topology, $\mathbb{R}$ and $\emptyset$ are not the only sets 
which are both closed and open, since each $\{a\}$ is also both closed and open. 


We now turn to the study of closed, bounded subsets of $\mathbb{R}$, which will turn 
out to be rather special. 


Definition: A cover of a set $X$ is a family of sets $\{F_\alpha\} = F$ such that their 
union contains $X$, thus 
$X \subset \bigcup_\alpha F_\alpha$ 


If $(S,U)$ is a topological space and $X \subset S$, then a cover $\{F_\alpha\}$ is said to be 
an open cover if $F_\alpha \in U$ for all $\alpha$, namely, if $F_\alpha$ are all open sets. 


Now there is a famous theorem: 


Heine-Borel Theorem: Any open cover of a closed bounded subset of $\mathbb{R}^n$ (in 
the usual topology) admits a finite subcover. 


Let us see what this means for $\mathbb{R}$. An example of a closed bounded subset 
is a closed interval $[a,b]$. The theorem says that if we have any (possibly 
infinite) collection of open sets $\{F_\alpha\}$ which cover $[a,b]$ then a finite subset of 
this collection also exists which covers $[a,b]$. The reader should try to convince 
herself of this with an example. 

An open interval $(a,b)$ is bounded but not closed, while the semi-infinite 
interval $[0,\infty) = \{ x \in \mathbb{R} \mid x \ge 0 \}$ is closed but not bounded. So the 
Heine-Borel theorem does not apply in these cases. And indeed, here is an example of 
an open cover of $(a,b)$ with no finite subcover. Take $(a,b) = (-1,1)$. Let 


$F_n = \left( -1 + \tfrac{1}{n},\ 1 - \tfrac{1}{n} \right), \quad n = 2, 3, 4, \ldots$ (1.11) 


With a little thought, one sees that $\bigcup_{n=2}^{\infty} F_n = (-1,1)$. Therefore $\{F_n\}$ provides 
an open cover of $(-1,1)$. But no finite subset of the collection $\{F_n\}$ is a cover 
of $(-1,1)$. 
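
The failure of every finite subcollection can be made concrete. The sketch below, using exact rational arithmetic, takes any finite list of indices and produces a point of $(-1,1)$ that none of the chosen intervals $(-1 + \tfrac{1}{n}, 1 - \tfrac{1}{n})$ contains. The function names are hypothetical.

```python
from fractions import Fraction

def F(n):
    """End-points of the interval (-1 + 1/n, 1 - 1/n)."""
    return (-1 + Fraction(1, n), 1 - Fraction(1, n))

def covered(x, ns):
    """True if x lies in at least one of the intervals with index in ns."""
    return any(F(n)[0] < x < F(n)[1] for n in ns)

def uncovered_point(ns):
    """Given finitely many indices n >= 2, return a point of (-1, 1) they all miss."""
    N = max(ns)
    # The intervals are nested, so the union of finitely many of them is the
    # largest one. Its right end-point 1 - 1/N lies in (-1, 1) but not in it.
    return 1 - Fraction(1, N)

ns = [2, 5, 10]
x = uncovered_point(ns)
print(x)               # 9/10
print(covered(x, ns))  # False
```

The missed point is picked up by a later interval, here the one with $n = 11$, which is why the full infinite collection does cover $(-1,1)$.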


Exercise: Find an open cover of $[0,\infty)$ which has no finite subcover. 


Closed bounded subsets of $\mathbb{R}^n$ have several good properties. They are 
known as compact sets on $\mathbb{R}^n$. Now we need to generalize this notion to arbitrary 
topological spaces. In that case we cannot always give a meaning to “bounded”, 
so we proceed using the equivalent notions provided by the Heine-Borel theorem. 


Definition: Given a topological space $(S, \mathcal{U})$, a set $X \subset S$ is said to be compact
if every open cover of $X$ admits a finite subcover.


Theorem: Let S be a compact topological space. Then every infinite subset 
of S has a limit point. This is one example of the special properties of compact 
sets. 


Exercise: Show by an example that this is not true for non-compact sets. It 
is worth looking up the proof of the above theorem. 


Theorem: Every closed subset of a compact space is compact in the relative
topology. Thus, compactness is preserved on passing to a closed topological subspace.


1.7 Continuous Functions 


Using the general concept of topological spaces developed above, we now turn
to the definition of continuity. Suppose $(S, \mathcal{U})$ and $(T, \mathcal{V})$ are topological spaces,
and $f : S \to T$ is a function. Since $f$ is not in general injective (one to one), it
does not have an inverse in the sense of a function $f^{-1} : T \to S$. But we can
define a more general notion of inverse, which takes any subset of $T$ to some
subset of $S$. This will provide the key to understanding continuity.


Definition: If $T' \subset T$, then the inverse $f^{-1}(T') \subset S$ is defined by:


\[ f^{-1}(T') = \{ s \in S \mid f(s) \in T' \} \]


Note that the inverse is defined on all subsets $T'$ of $T$, including the individual
elements $\{t\} \subset T$ treated as special, single-element subsets. However it
does not necessarily map the latter to individual elements $s \in S$, but rather to
subsets of $S$ as above. The inverse evaluated on a particular subset of $T$ may
of course be the empty set $\emptyset \subset S$.

For the special case of bijective (one-to-one and onto) functions, single-element
sets $\{t\} \subset T$ will be mapped to single-element sets $f^{-1}(\{t\}) \subset S$. In
this case we can treat the inverse as a function on $T$, and the definition above
coincides with the usual inverse function.


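Consider an example in $\mathbb{R}$ with the usual topology:


Example: $f : \mathbb{R} \to \mathbb{R}^+$ is defined by $f(x) = x^2$. Then $f^{-1} : \{y\} \subset \mathbb{R}^+ \to
\{\sqrt{y}, -\sqrt{y}\} \subset \mathbb{R}$ (Fig. 1.2). Now take $f^{-1}$ on an open set of $\mathbb{R}^+$, say $(1,4)$.
Clearly


\[ f^{-1} : (1,4) \subset \mathbb{R}^+ \to (1,2) \cup (-2,-1) \subset \mathbb{R}. \tag{1.12} \]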




Figure 1.2: $f(x) = x^2$, an example of a continuous function.


Thus the inverse maps open sets of $\mathbb{R}^+$ to open sets of $\mathbb{R}$.


One can convince oneself that this is true for any continuous function $\mathbb{R} \to
\mathbb{R}^+$. Moreover, it is false for discontinuous functions, as the following example
shows (Fig. 1.3):


\[ f(x) = \begin{cases} x + 1, & x < 0 \\ x + 2, & x \ge 0 \end{cases} \]


Figure 1.3: An example of a discontinuous function.


In this example, the open set $(\tfrac{3}{2}, \tfrac{5}{2}) \subset \mathbb{R}$ is mapped by $f^{-1}$ to the set $[0, \tfrac{1}{2})$,
which is not open. In fact it can be shown that on $\mathbb{R}$ with the usual topology,
the continuous functions are those for which the inverse always takes open sets
to open sets. For discontinuous functions (i.e. functions having a “break” in
their graph) there will be at least one open set which is mapped by the inverse
function to a non-open set.

Exercise: Try to formulate a general proof of the above statement.


As before, we use this property of the real line as a way of defining continuous
functions on arbitrary topological spaces.


Definition: For general topological spaces $(S, \mathcal{U})$ and $(T, \mathcal{V})$, a function $f :
S \to T$ is called continuous if its inverse takes open sets of $T$ to open sets of $S$.
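For finite sets this definition can be checked mechanically, which may help in getting used to it. The sketch below is only an illustration under our own assumptions: topologies are represented as Python sets of `frozenset`s, maps as dictionaries, and the two-point space and the names `preimage` and `is_continuous` are hypothetical choices, not notation from the text.

```python
def preimage(f, subset, domain):
    """f^{-1}(T') = { s in S | f(s) in T' } for a map f given as a dict."""
    return frozenset(s for s in domain if f[s] in subset)

def is_continuous(f, S, U, V):
    """True if the preimage of every open set in V is an open set in U."""
    return all(preimage(f, v, S) in U for v in V)

# Hypothetical example: the two-point space {0, 1} with open sets {}, {1}, {0, 1}.
S = frozenset({0, 1})
U = {frozenset(), frozenset({1}), S}

identity = {0: 0, 1: 1}
swap = {0: 1, 1: 0}

print(is_continuous(identity, S, U, U))  # True
print(is_continuous(swap, S, U, U))      # False: the preimage of {1} is {0}, not open
```

The map `swap` is bijective, yet it fails to be continuous, so continuity really depends on the chosen collection of open sets and not only on the underlying sets.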


Exercise: Show that if we put the discrete topology on $S$ and $T$ then every
function $f : S \to T$ is continuous. Clearly this topology is too crude to capture
any interesting information about continuity. At the other extreme, if we put the
indiscrete topology on $S$ while $T$ is, say, $\mathbb{R}$ with the usual topology, then only very
few functions are continuous (which are those?). (With the indiscrete topology on $T$
as well, every function would again be continuous.) So this topology is also rather
uninteresting from the point of view of continuity.


Exercise: Consider $2 \times 2$ real matrices


\[ M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \tag{1.13} \]


Think of the four entries $a, b, c, d$ as points in $\mathbb{R}^4$ with the metric topology. Show
that the “determinant map”, $\det : \mathbb{R}^4 \to \mathbb{R}$, is continuous, where $\det(a, b, c, d) =
ad - bc$.


1.8  Homeomorphisms 


One may define many apparently different topological spaces, but some of these 
may be “equivalent” for all purposes. To make this notion precise we need to 
define a kind of map between topological spaces, such that whenever such a 
map exists, the spaces will be considered to be the same. Basically, what such 
a map should do is to establish a 1-1 correspondence between elements of the 
two spaces viewed as sets, in such a way that the open sets of one are mapped 
onto open sets of the other. 


Definition: $f : S \to T$ is a homeomorphism if $f$ is bijective and both $f$ and
$f^{-1}$ are continuous.


Since $f$ is bijective, we can think of $f^{-1}$ as mapping points of $T$ to unique
points of $S$. The bijective property of $f$ establishes an equivalence between
points of $S$ and points of $T$, while both-ways continuity ensures that open sets
go to open sets in each direction.

If a homeomorphism exists between two topological spaces S and T then 
they are said to be homeomorphic and we think of them thereafter as being the 
same. 


Theorem: If $f : S \to T$ is a homeomorphism then $S$ is connected if and only
if $T$ is connected, and $S$ is compact if and only if $T$ is compact.


Exercise: Prove this theorem. This is merely a matter of carefully working 
through the definitions. 


1.9 Separability 


We close this chapter with a few definitions and comments concluding with the 
notion of separability. 


Let (S,U) be a topological space. 


Definition: The interior of a subset $X \subset S$, written $X^\circ$, is the union of all
open sets $u_i$ contained in $X$:


\[ X^\circ = \bigcup_{u_i \subset X,\; u_i \in \mathcal{U}} u_i \]


Definition: The boundary of $X \subset S$ is the difference between the closure $\bar{X}$ of
$X$ and the interior $X^\circ$ of $X$:


\[ b(X) = \bar{X} - X^\circ \]
Examples:
(i) Let $X = (a,b) \subset \mathbb{R}$ in the usual topology. Then $X^\circ = (a,b)$, $\bar{X} = [a,b]$, and
$b(X) = \{a,b\}$. The same is true for each of the choices $X = (a,b]$, $X = [a,b)$,
$X = [a,b]$.


(ii) Let $X = \{ (x,y) \in \mathbb{R}^2 \mid x^2 + y^2 < 1 \}$, the unit open disc. Then $X^\circ = X$,
and $\bar{X} = \{ (x,y) \in \mathbb{R}^2 \mid x^2 + y^2 \le 1 \}$, which is the unit closed disc. The
boundary is $b(X) = \{ (x,y) \in \mathbb{R}^2 \mid x^2 + y^2 = 1 \}$, which is the unit circle.

(iii) On $\mathbb{R}$ with the usual topology, consider the subset $\mathbb{Q} = \{ x \in \mathbb{R} \mid x = p/q \}$,
where $p \in \mathbb{Z}$, $q \in \mathbb{Z} - \{0\}$. These are the rational numbers. One can see that every
point of $\mathbb{R}$ is a limit point of $\mathbb{Q}$, because any open interval contains infinitely
many rational numbers. Thus $\bar{\mathbb{Q}} = \mathbb{R}$. We say that the rationals are dense in
the reals. Clearly the interior $\mathbb{Q}^\circ$ of $\mathbb{Q}$ is $\emptyset$, since no open set fits inside the
rationals, so that we also have $b(\mathbb{Q}) = \mathbb{R}$.
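On a finite topological space the interior, closure and boundary can be computed directly from these definitions. The following sketch rests on assumptions of our own: the three-point space is a hypothetical example, and the closure is computed through the standard identity "closure of $X$ = complement of the interior of the complement of $X$" rather than through limit points.

```python
def interior(X, opens):
    """Union of all open sets contained in X."""
    result = frozenset()
    for u in opens:
        if u <= X:
            result |= u
    return result

def closure(X, S, opens):
    """Complement of the interior of the complement (smallest closed set containing X)."""
    return S - interior(S - X, opens)

def boundary(X, S, opens):
    """b(X) = closure of X minus interior of X."""
    return closure(X, S, opens) - interior(X, opens)

# Hypothetical three-point space, chosen only for illustration.
S = frozenset("abc")
opens = [frozenset(), frozenset("a"), frozenset("ab"), S]

X = frozenset("b")
print(sorted(interior(X, opens)))      # []
print(sorted(closure(X, S, opens)))    # ['b', 'c']
print(sorted(boundary(X, S, opens)))   # ['b', 'c']
```

As with the rationals in example (iii), the set $X$ here has empty interior, so its boundary coincides with its closure.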


An important concept in topological spaces is that of separability. Given 
two distinct points, it may be possible to enclose one in an open set which does 
not contain the other. One may be able to do better, and enclose both of them 
in two disjoint open sets. There are various degrees of separability which a 
topological space may have. The one which is most relevant in the study of 
manifolds and hence of physics is the following: 


Definition: A topological space $S$ is Hausdorff if, whenever $s_1, s_2$ are two
distinct elements of $S$, there exist disjoint open sets $u_1, u_2$ such that $s_1 \in u_1$,
$s_2 \in u_2$.
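For a finite space the Hausdorff condition can be tested by brute force over all pairs of points and all pairs of open sets. This is a minimal sketch under our own conventions (open sets stored as `frozenset`s; the function name `is_hausdorff` and the two sample topologies are hypothetical illustrations, not taken from the text).

```python
from itertools import combinations

def is_hausdorff(S, opens):
    """True if every pair of distinct points has disjoint open neighbourhoods."""
    for s1, s2 in combinations(S, 2):
        separated = any(
            s1 in u1 and s2 in u2 and not (u1 & u2)
            for u1 in opens for u2 in opens
        )
        if not separated:
            return False
    return True

S = frozenset({0, 1})
discrete = [frozenset(), frozenset({0}), frozenset({1}), S]
coarser = [frozenset(), frozenset({1}), S]   # the point 0 lies only in the open set S

print(is_hausdorff(S, discrete))   # True
print(is_hausdorff(S, coarser))    # False
```

In the second topology every open set containing the point 0 also contains the point 1, so the two points cannot be enclosed in disjoint open sets.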




Exercises: 

(i) Show that $\mathbb{R}^n$ with the usual topology is Hausdorff.

(ii) Show that $\mathbb{R}$ with the indiscrete topology is not Hausdorff.

(iii) Consider the following type of space. $S$ is any infinite set. A nonempty subset
$u \subset S$ is an open set if its complement $u'$ is a finite set; the empty set is also
taken to be open. Check that this defines a topology on $S$. Is this a
Hausdorff topology?

(iv) Show that every metric space is Hausdorff. In fact, metric spaces are 
normal, a stronger condition implying that disjoint closed sets can be enclosed 
in disjoint open sets. 


Exercise: Show that every compact subset of a Hausdorff space is closed. 


Chapter 2 


Homotopy 


2.1 Loops and Homotopies 


In this chapter we discuss ways to understand the connectivity of a topological 
space. These will consist largely of the study of “closed loops” on a topological 
space, and the possibility of deforming these into each other. Many, though not 
all, essential properties of a topological space emerge on studying connectivity. 
We have already defined a connected topological space: one which cannot 
be expressed as the union of two disjoint open sets. There is another kind of 
“connectivity” property of topological spaces which will prove very important. 
Consider as an example the plane $\mathbb{R}^2$ with the unit disc cut out (Fig. 2.1).


Figure 2.1: $\mathbb{R}^2$ with a disc cut out.


On this space, a loop like $l_1$ (we will give a precise definition of “loops”
later), which does not encircle the disc, has the property that:
(i) $l_1$ can be continuously shrunk to a point.
(ii) $l_1$ can be continuously deformed to any other loop not encircling the disc.
(iii) $l_1$ cannot be continuously deformed to a loop like $l_2$ which encircles the disc
once.

The study of whether loops in a topological space can be deformed into
others is part of homotopy theory, and is an important tool in characterizing
the topology of spaces. In addition to loops, which are topologically circles,
we can consider subspaces that are topologically higher dimensional spheres
$S^n$. For example, $\mathbb{R}^3$ with the unit ball removed has a different connectivity
property: all loops can be deformed into each other, but 2-spheres in that space
may not be shrinkable, if they enclose the “hole”.

So far we have worked at an intuitive level and with familiar spaces. Now 
let us give precise definitions of the objects in terms of which we will formulate 
the study of homotopy. 


Definition: A path $\alpha(t)$ in a topological space $S$, from $x_0 \in S$ to $x_1 \in S$, is a
continuous map
\[ \alpha : [0,1] \to S \]
such that
\[ \alpha(0) = x_0, \qquad \alpha(1) = x_1 \]


Note that the space $[0,1]$ appearing on the left-hand side of the definition
is the familiar closed unit interval in $\mathbb{R}$, while the space on the right-hand side
can be any arbitrary topological space.

When $S$ is a space like $\mathbb{R}^n$ with which we are familiar, the above definition
of a path seems quite reasonable. But continuous paths can exist in
arbitrary topological spaces, even including spaces with finitely many points!
On reflection this should be no surprise, since our definition of continuity in the
previous chapter made sense for arbitrary topological spaces. A simple example
is provided by the following exercise.


Exercise: Consider the set $S = \{a,b,c\}$ of three elements, with the topology
$\mathcal{U} = \{\emptyset, S, \{a\}, \{b\}, \{a,b\}\}$. Find a continuous path from $a$ to $b$.


It should be kept in mind that a path as defined above is not just the
image of a map, but the map itself. One should imagine that as the parameter
$t$ moves through values from 0 to 1, its image in the topological space moves in
that space, and it is the map between these two motions which we call the path.
It is convenient to think of the parameter $t$ as a time; then the map gives the
motion of a point on the topological space as a function of time. For example,
given a path $\alpha(t)$, we can define a new path $\beta(t) = \alpha(t^2)$ which traces out the
same image in the same total time, but corresponds to a different map and is
therefore treated as a different path.
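The distinction between a path and its image is easy to see numerically. In the sketch below the semicircular path `alpha` is a hypothetical example of our own; `beta` is the reparametrized path $\beta(t) = \alpha(t^2)$ discussed above.

```python
import math

def alpha(t):
    """A hypothetical path in R^2: the upper unit semicircle from (1, 0) to (-1, 0)."""
    return (math.cos(math.pi * t), math.sin(math.pi * t))

def beta(t):
    """The reparametrized path beta(t) = alpha(t^2)."""
    return alpha(t * t)

# Same end points, reached at the same times t = 0 and t = 1 ...
print(alpha(0) == beta(0), alpha(1) == beta(1))   # True True
# ... but at an intermediate time the two moving points are in different places.
print(alpha(0.5))   # approximately (0, 1)
print(beta(0.5))    # approximately (0.707, 0.707)
```

Both maps sweep out the same semicircle, but `beta` starts more slowly and catches up later, so as maps $[0,1] \to \mathbb{R}^2$ they are different paths.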


Having defined paths, we use them to define a new notion of connectedness.

Definition: $S$ is arcwise connected or path connected if there always exists a
path $\alpha(t)$ between any pair of points $x_0$ and $x_1$.


Theorem: If S is arcwise connected then it is connected. (Recall that we 
defined connectedness in the previous chapter, without recourse to paths in the 
space.) 


Exercise: Prove the above theorem. Assume $S$ is arcwise connected, but not
connected, and find a contradiction.


A path, as defined above, has distinct end-points in general. But paths
which close on themselves turn out to be the most useful ones.


Definition: A closed path or loop in $S$ at $x_0$ is a path $\alpha(t)$ for which $x_0 = x_1$,
that is, $\alpha(0) = \alpha(1) = x_0$. The loop is said to be based at $x_0$ (see Fig. 2.2).


Notice that a loop is a map from a circle to an arbitrary topological space
$S$. However it contains some additional information in the form of the “base
point” $x_0$. In what follows, we will always deal with based loops, in other words
loops with a fixed base point.




Figure 2.2: A based loop in a topological space S. 


We will find it useful to develop the notion of multiplication of loops. Each
loop is a map from the circle to the space, and the product will again be such
a map. We only admit multiplication between two based loops with the same
base point.

To multiply two loops based at the same point $x_0$, define a map $[0,1] \to S$
for which, as $t$ goes from 0 to 1, the image first traces out one loop and then
the other. This is formalised as follows.


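Definition: Suppose $\alpha(t)$, $\beta(t)$ are two loops based at $x_0$. The product loop
$\gamma = \alpha * \beta$ is defined by:


\[ \gamma(t) = \begin{cases} \alpha(2t), & 0 \le t \le \tfrac{1}{2} \\ \beta(2t - 1), & \tfrac{1}{2} \le t \le 1 \end{cases} \tag{2.1} \]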


It is easy to check that this defines a continuous loop based at $x_0$. Thinking
of $t$ as time, we may say that in the product map defined above, the image point
in $S$ moves along the loop $\alpha$ during the first half of the total time, and along $\beta$
in the second half.
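The product of loops translates directly into code. The following is a small sketch under our own assumptions: loops are Python functions on $[0,1]$ with values in $\mathbb{R}^2$, and the helper `circle(n)`, a loop winding $n$ times around the unit circle, is a hypothetical example rather than anything defined in the text.

```python
import math

def product(alpha, beta):
    """Product loop gamma = alpha * beta: alpha on [0, 1/2], then beta on [1/2, 1]."""
    def gamma(t):
        return alpha(2 * t) if t <= 0.5 else beta(2 * t - 1)
    return gamma

def circle(n):
    """Hypothetical loop in R^2 based at (1, 0): winds n times around the unit circle."""
    return lambda t: (math.cos(2 * math.pi * n * t), math.sin(2 * math.pi * n * t))

gamma = product(circle(1), circle(-1))
print(gamma(0.0))    # the base point (1.0, 0.0)
print(gamma(0.25))   # alpha at its half-way time: approximately (-1, 0)
print(gamma(0.75))   # beta at its half-way time: approximately (-1, 0)
```

At $t = 1/2$ both branches give the base point, which is why the product is continuous there.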


Definition: The inverse loop $\alpha^{-1}(t)$ of a loop $\alpha(t)$ is defined by:


\[ \alpha^{-1}(t) = \alpha(1 - t), \qquad 0 \le t \le 1 \]




This is just the same loop traced backwards. If we only look at their images 
in the topological space, a loop and its inverse look the same, but as maps they 
are clearly different. 


Definition: The constant loop is the map $\alpha(t) = x_0$, $0 \le t \le 1$. The image of
this map is a single point.
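The inverse loop and the constant loop can be sketched in the same style. The loop `circle` below is again a hypothetical example in $\mathbb{R}^2$ and the function names are our own.

```python
import math

def inverse(alpha):
    """Inverse loop: alpha^{-1}(t) = alpha(1 - t)."""
    return lambda t: alpha(1 - t)

def constant(x0):
    """Constant loop at the base point x0."""
    return lambda t: x0

def circle(t):
    """Hypothetical loop in R^2 based at (1, 0): once anticlockwise around the unit circle."""
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))

back = inverse(circle)
print(circle(0.25))                 # approximately (0, 1): a quarter turn anticlockwise
print(back(0.25))                   # approximately (0, -1): a quarter turn clockwise
print(constant((1.0, 0.0))(0.7))    # (1.0, 0.0) at every time t
```

The images of `circle` and `back` are the same circle, but the two maps visit its points in opposite order.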


Now we implement the intuitive idea that some loops based at $x_0$ can be
continuously deformed to other loops based at the same point. Such a deformation,
if it exists, should be a map which specifies how a given loop “evolves”
continuously to another one.


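Definition: Two loops $\alpha(t)$ and $\beta(t)$ based at $x_0$ are homotopic to each other
if there exists a continuous map:


\[ H : [0,1] \times [0,1] \to S \]


such that
\begin{align*}
H(t,0) &= \alpha(t), & 0 \le t \le 1 \\
H(t,1) &= \beta(t), & 0 \le t \le 1 \\
H(0,s) &= H(1,s) = x_0, & 0 \le s \le 1
\end{align*}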


The meaning of this is obvious with a little thought. We have introduced
a new parameter $s \in [0,1]$ which we would also like to think of as a second
“time”. The map $H(t,s)$, for each fixed $s$, defines a loop in the space $S$ based
at $x_0$. Therefore, as $s$ evolves, the entire loop can be thought of as evolving. At
the initial value $s = 0$ the loop was $\alpha(t)$, while at the final value $s = 1$ it has
become a different loop $\beta(t)$. At every intermediate value of $s$ it is some other
loop, always based at $x_0$. $H$ is called a homotopy. An example of a homotopy
$H(t,s)$ is illustrated in Fig. 2.3.
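A simple explicit homotopy exists whenever the space is $\mathbb{R}^2$ (or any convex region): interpolate linearly between the two loops. The sketch below is our own illustration, assuming $S = \mathbb{R}^2$, a base point `X0` chosen arbitrarily, and two hypothetical loops; it merely checks the three defining conditions on a grid of sample values.

```python
import math

X0 = (1.0, 0.0)  # common base point (an arbitrary choice for this sketch)

def alpha(t):
    """Unit circle through X0, traversed once."""
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))

def beta(t):
    """Constant loop at X0."""
    return X0

def H(t, s):
    """Straight-line homotopy H(t, s) = (1 - s) alpha(t) + s beta(t) in R^2."""
    a, b = alpha(t), beta(t)
    return ((1 - s) * a[0] + s * b[0], (1 - s) * a[1] + s * b[1])

def close(p, q, tol=1e-12):
    """Compare two points of R^2 up to rounding error."""
    return abs(p[0] - q[0]) < tol and abs(p[1] - q[1]) < tol

grid = [k / 10 for k in range(11)]
print(all(close(H(t, 0), alpha(t)) for t in grid))                    # H(t, 0) = alpha(t)
print(all(close(H(t, 1), beta(t)) for t in grid))                     # H(t, 1) = beta(t)
print(all(close(H(0, s), X0) and close(H(1, s), X0) for s in grid))   # base point fixed
```

This interpolation is not available in a space such as the plane with a disc removed, because the straight line between two points may pass through the hole; that is exactly where nontrivial homotopy classes come from.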


Figure 2.3: A homotopy between two based loops $\alpha(t) = H(t,0)$ and $\beta(t) = H(t,1)$,
shown at $s = 0, 1/3, 2/3, 1$.


If two loops $\alpha(t)$, $\beta(t)$ are homotopic to each other, we write $\alpha \approx \beta$.


Theorem: Homotopy is an equivalence relation. This means the following
conditions are satisfied:

(i) $\alpha \approx \alpha$ (reflexivity)

(ii) $\beta \approx \alpha \Leftrightarrow \alpha \approx \beta$ (symmetry)

(iii) $\alpha \approx \beta$, $\beta \approx \gamma \Rightarrow \alpha \approx \gamma$ (transitivity)

Proof: Here is a sketch of the proof, which consists of displaying explicitly a
homotopy $H$ in each case.

(i) Pick $H(t,s) = \alpha(t)$ for all $s$. This proves reflexivity.

(ii) Given $H(t,s) : \alpha \to \beta$, define a new homotopy $H(t, 1-s) : \beta \to \alpha$. This
proves symmetry.

(iii) Given $H_1(t,s) : \alpha \to \beta$, and $H_2(t,s) : \beta \to \gamma$, define $H_3(t,s) : \alpha \to \gamma$ by


\[ H_3(t,s) = \begin{cases} H_1(t, 2s), & 0 \le s \le \tfrac{1}{2} \\ H_2(t, 2s - 1), & \tfrac{1}{2} \le s \le 1 \end{cases} \]


This proves transitivity and completes the proof.
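For completeness, the conditions that the composite map in part (iii) must satisfy can be checked line by line. The following short verification is ours, not the authors'; it assumes only the defining properties of the two given homotopies, writing $H_1$ for the homotopy from $\alpha$ to $\beta$ and $H_2$ for the one from $\beta$ to $\gamma$.

```latex
\begin{align*}
H_3(t,0) &= H_1(t,0) = \alpha(t), \\
H_3(t,\tfrac{1}{2}) &= H_1(t,1) = \beta(t) = H_2(t,0)
  && \text{(the two pieces agree where they meet)}, \\
H_3(t,1) &= H_2(t,1) = \gamma(t), \\
H_3(0,s) &= H_3(1,s) = x_0
  && \text{(since both } H_1 \text{ and } H_2 \text{ fix the base point)}.
\end{align*}
```

The second line is the one that guarantees continuity of the composite homotopy at $s = 1/2$.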


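Homotopies also satisfy the following additional properties:
(iv) $\alpha \approx \beta \Leftrightarrow \alpha^{-1} \approx \beta^{-1}$
(v) $\alpha \approx \beta$, $\alpha' \approx \beta' \Rightarrow \alpha * \alpha' \approx \beta * \beta'$.
However, it is not in general true that $\alpha \approx \alpha^{-1}$.

Exercise: Find an example of a topological space and a loop $\alpha(t)$ in it such
that $\alpha$ is not homotopic to $\alpha^{-1}$. Find another example for which $\alpha$ and $\alpha^{-1}$ are
nontrivial yet homotopic to each other.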


2.2 The Fundamental Group 


A very important property of an equivalence relation on a set of objects (in 
this case the set of based loops) is that it partitions the set into disjoint classes, 
called “equivalence classes”. Within each class, all elements are equivalent to 
each other under the given relation. 

In the present case, denote by $[\alpha]$ the class of all loops homotopic to $\alpha(t)$
(always with reference to a fixed base point). One can check that multiplication
of classes


\[ [\alpha] * [\beta] = [\alpha * \beta] \]


is well-defined. This amounts to showing that the above product defines the 
same class irrespective of which loops we choose as representatives for the classes 
on the left hand side. 


Definition: The collection of all distinct homotopy classes of loops in $S$ based
at $x_0$ is:
\[ \pi_1(S, x_0) = \{ [\alpha] \mid \alpha(t) \text{ is a loop in } S \text{ based at } x_0 \} \]
$\pi_1(S, x_0)$ is a group under multiplication. It is called the fundamental group or
first homotopy group of $S$ at $x_0$.


The group property of $\pi_1$ needs to be proved. We do this by checking the
group axioms:
(i) Closure:
\[ [\alpha] \in \pi_1,\; [\beta] \in \pi_1 \;\Rightarrow\; [\alpha] * [\beta] = [\alpha * \beta] \in \pi_1 \]


(ii) Identity: $[i]$ is the equivalence class of loops homotopic to the constant loop
at $x_0$. Clearly


\[ [\alpha] * [i] = [i] * [\alpha] = [\alpha] \]
for all $[\alpha]$.


(iii) Inverse: $[\alpha]^{-1} = [\alpha^{-1}]$, because
\[ [\alpha]^{-1} * [\alpha] = [\alpha^{-1} * \alpha] = [i] \]
(iv) Associativity: To show that


\[ ([\alpha] * [\beta]) * [\gamma] = [\alpha] * ([\beta] * [\gamma]) \]


is straightforward but requires some thought, and this part of the proof is left
as an exercise to the reader.
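As a hint for the associativity exercise, it helps to see that the two bracketings are genuinely different maps, so that a homotopy (in fact a reparametrization) is needed. The sketch below uses hypothetical "labelled" loops of our own devising, which simply report which loop is being traversed and at what internal time.

```python
def product(alpha, beta):
    """Product of loops: alpha on [0, 1/2], then beta on [1/2, 1]."""
    return lambda t: alpha(2 * t) if t <= 0.5 else beta(2 * t - 1)

# Hypothetical "labelled" loops: each reports its own name and internal parameter.
alpha = lambda t: ("alpha", t)
beta = lambda t: ("beta", t)
gamma = lambda t: ("gamma", t)

left = product(product(alpha, beta), gamma)    # (alpha * beta) * gamma
right = product(alpha, product(beta, gamma))   # alpha * (beta * gamma)

print(left(0.375))    # ('beta', 0.5)
print(right(0.375))   # ('alpha', 0.75)
```

In `left` the loop $\alpha$ occupies the time interval $[0, 1/4]$, while in `right` it occupies $[0, 1/2]$. Both products traverse the three loops in the same order and differ only in their timing, which suggests constructing the required homotopy by continuously shifting the break points.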


To proceed further, we will find it useful to define the product operation
even on paths which are not closed, namely, paths $\alpha(t)$ such that $\alpha(0)$ and $\alpha(1)$
are distinct.


Definition: The product of two open paths $\alpha(t)$ and $\beta(t)$ is defined only if
$\alpha(1) = \beta(0)$, in other words, if the final point of $\alpha$ is the same as the initial
point of $\beta$, and is given by $\gamma = \alpha * \beta$ where:


\[ \gamma(t) = \begin{cases} \alpha(2t), & 0 \le t \le \tfrac{1}{2} \\ \beta(2t - 1), & \tfrac{1}{2} \le t \le 1 \end{cases} \]


As one might expect, the initial point of the product path $\alpha * \beta$ is the initial
point of $\alpha$, and the final point of $\alpha * \beta$ is the final point of $\beta$.


Now returning to the fundamental group, we note that in principle this
group depends on the base point with respect to which it is defined. Thus,
$\pi_1(S, x_0)$ is a different group from $\pi_1(S, x_1)$, for $x_0 \ne x_1$. But in fact under very
general conditions the two are the same.

To specify what we mean by two groups being the “same”, let us first
define the concept of “homomorphism” between groups (not to be confused with
“homeomorphism” between topological spaces!). This is a mapping from one
group to another which preserves the group operations.


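Definition: If $G$ and $H$ are two groups, a homomorphism $\phi : G \to H$ is a map
$g \in G \to \phi(g) \in H$ such that:


\begin{align*}
\phi &: g^{-1} \in G \;\Rightarrow\; \phi(g^{-1}) = (\phi(g))^{-1} \in H \\
\phi &: g_1 \cdot g_2 \in G \;\Rightarrow\; \phi(g_1 \cdot g_2) = \phi(g_1) \cdot \phi(g_2) \in H \\
\phi &: 1 \in G \;\Rightarrow\; \phi(1) = 1 \in H
\end{align*}


(here “$\phi :$” should be read as “$\phi$ maps” (a given element of $G$ to an element of
$H$)).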
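The three conditions can be tested on a familiar example. The sketch below is our own illustration: it takes $G$ to be the integers under addition and $H$ to be the integers modulo 2, with the hypothetical map `phi(n) = n mod 2`, and checks the conditions on a finite sample of integers (a spot check, not a proof).

```python
def phi(n):
    """Map from the integers under addition to Z_2 (integers modulo 2)."""
    return n % 2

def add_mod2(a, b):
    """Group operation in Z_2."""
    return (a + b) % 2

sample = range(-5, 6)

# phi(g1 . g2) = phi(g1) . phi(g2), with "." meaning addition in each group
print(all(phi(m + n) == add_mod2(phi(m), phi(n)) for m in sample for n in sample))
# phi(g^{-1}) is the inverse of phi(g): their sum in Z_2 is the identity 0
print(all(add_mod2(phi(-n), phi(n)) == 0 for n in sample))
# the identity of G is mapped to the identity of H
print(phi(0) == 0)
```

All three lines print `True`. This particular homomorphism is onto but not one-to-one, since all even integers are sent to 0, so it is a homomorphism without being bijective.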


Definition: A homomorphism $\phi$ is an isomorphism if it is also bijective
(one-to-one and onto). If an isomorphism exists between two groups, the groups are
said to be isomorphic, and can be thought of as completely equivalent to each
other.


We now show that the fundamental group of a topological space is inde- 
pendent of the base point under some conditions. 


Figure 2.4: Isomorphism of $\pi_1(S, x_0)$ and $\pi_1(S, x_1)$ for a path-connected space.


Theorem: If a topological space $S$ is path-connected, $\pi_1(S, x_0)$ and $\pi_1(S, x_1)$
are isomorphic as groups.


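Proof. This is illustrated in Fig. 2.4. Consider a loop $\alpha(t)$ based at $x_0$ along with its
equivalence class $[\alpha] \in \pi_1(S, x_0)$. Let $\gamma(t)$ be an open path from $x_0$ to $x_1$. Now
define a map


\[ \phi : \pi_1(S, x_0) \to \pi_1(S, x_1) \tag{2.4} \]
by
\[ \phi([\alpha]) = [\gamma^{-1} * \alpha * \gamma] \tag{2.5} \]


Clearly $[\alpha] \in \pi_1(S, x_0) \Rightarrow \phi([\alpha]) \in \pi_1(S, x_1)$. And moreover this is an isomorphism.
For example, the product law holds:


\begin{align*}
\phi([\alpha] * [\beta]) &= \phi([\alpha * \beta]) \\
&= [\gamma^{-1} * \alpha * \beta * \gamma] \\
&= [\gamma^{-1} * \alpha * \gamma] * [\gamma^{-1} * \beta * \gamma] \\
&= \phi([\alpha]) * \phi([\beta]) \tag{2.6}
\end{align*}


with $[\alpha], [\beta] \in \pi_1(S, x_0)$. The third equality holds because $\gamma * \gamma^{-1}$ is homotopic
to the constant loop at $x_0$ and can therefore be inserted between $\alpha$ and $\beta$. Similarly
one can check the other properties of a homomorphism, as well as the fact that this
map is bijective, which finally makes it an isomorphism.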


Because of the above theorem, for path-connected spaces we denote the
fundamental group as $\pi_1(S)$ instead of $\pi_1(S, x_0)$. It must be kept in mind,
however, that in the process of finding $\pi_1$ we must always work with loops based
at some (arbitrary) point $x_0$. The final result for $\pi_1$ will then be independent
of $x_0$.


2.3 Homotopy Type and Contractibility 


So far we have discussed maps from the closed interval [0, 1] to a topological 
space S, and defined two such maps to be homotopic if there exists a suitable 
map from [0, 1] x [0,1] to S. This has a straightforward generalisation involving 
two topological spaces. 


Definition: Let $S$ and $T$ be two arbitrary topological spaces, and consider two
different continuous maps


\begin{align*}
f_0 &: S \to T \\
f_1 &: S \to T \tag{2.7}
\end{align*}


The maps $f_0$ and $f_1$ are said to be homotopic to each other if there exists a
continuous map


\[ F : S \times [0,1] \to T \tag{2.8} \]
such that


\begin{align*}
F(x, 0) &= f_0(x) \\
F(x, 1) &= f_1(x) \tag{2.9}
\end{align*}


In the special case where $S$ is the closed interval $[0,1]$ on the real line, this
homotopy of maps reduces to homotopy of paths or loops, but one should note
that there is no reference to a base-point. So we will use the symbol $\sim$ to denote
the homotopy of maps defined above (which does not involve a base point), as
against $\approx$ which always denotes the homotopy of based loops.

Homotopy of maps between general topological spaces is useful because
it helps us identify properties that the two spaces may have in common. For
example, let us find the condition that two different path-connected spaces $S$
and $T$ have the same fundamental group: $\pi_1(S) = \pi_1(T)$.


Definition: Two topological spaces $S$ and $T$ are of the same homotopy type if
there exist continuous maps


\[ f : S \to T \quad \text{and} \quad g : T \to S \]
such that (we use the symbol “$\circ$” for composition of maps):


\begin{align*}
f \circ g &: T \to T \;\sim\; i_T \\
g \circ f &: S \to S \;\sim\; i_S
\end{align*}


where $i_T$ is the identity map $T \to T$ and similarly for $i_S$.


The property of “being of the same homotopy type” guarantees that $\pi_1(S)$
and $\pi_1(T)$ are the same group. This is embodied in the following result.


Theorem: If $S$ and $T$ are two path-connected topological spaces of the same
homotopy type, then $\pi_1(S)$ is isomorphic as a group to $\pi_1(T)$. (The proof is
somewhat complicated and we will skip it.)


Theorem: If $S$ and $T$ are homeomorphic as topological spaces then in particular
they are of the same homotopy type, and hence have the same $\pi_1$. This
is obvious from the above, since a homeomorphism is a pair of continuous maps
$f : S \to T$, $g : T \to S$ such that $g = f^{-1}$, i.e.


\[ f \circ g = i_T, \qquad g \circ f = i_S. \tag{2.10} \]


Thus the theorem is true.


Summarizing, we have found out two important facts: 


(i) Homotopy is a topological invariant. Two spaces which are topologically 
equivalent (homeomorphic) have the same homotopy properties. 

(ii) However, the converse is not necessarily true. Two topological spaces may
have the same $\pi_1$ (if they are of the same homotopy type), but this does not
imply that they are homeomorphic.


The following definitions, theorems and examples tend to crop up fairly 
often in physical applications. 


Definition: A topological space is contractible if it is of the same homotopy
type as a single point.

Definition: A topological space $S$ for which $\pi_1(S) = \{i\}$ is called simply
connected. Otherwise it is called multiply connected.


A simply connected space has no “nontrivial” loops, in other words all loops
are deformable, or homotopic, to each other.


For a contractible space $S$, the fundamental group $\pi_1(S) = \{i\}$, the identity
element. Therefore it is simply connected. However a simply connected space
need not be contractible, as we will see in examples below.


Theorem: $\pi_1(X \times Y) = \pi_1(X) \oplus \pi_1(Y)$. Here the direct sum of groups, $\oplus$, is
their Cartesian product as sets, with multiplication defined componentwise:


\[ \pi_1(X) \oplus \pi_1(Y) = \{ (a, b) \mid a \in \pi_1(X),\; b \in \pi_1(Y) \} \tag{2.11} \]
The proof of this theorem is quite simple and is left as an exercise.


Examples:

(i) $\mathbb{R}^n$ is contractible. To show this, we need to find continuous maps $f : \mathbb{R}^n \to
\{p\}$ and $g : \{p\} \to \mathbb{R}^n$. Clearly the only available choices are $f(\vec{x}) = p$, the
constant map, and $g(p) = \vec{0}$, where $\vec{0}$ is some chosen point in $\mathbb{R}^n$ (which we
may call the origin). Then, $g \circ f : \mathbb{R}^n \to \mathbb{R}^n$ is the map


\[ g \circ f\,(\vec{x}) = \vec{0} \tag{2.12} \]


Define a homotopy $F : \mathbb{R}^n \times [0,1] \to \mathbb{R}^n$ by $F(\vec{x}, t) = t\vec{x}$. For $t = 0$, this is the
constant map $g \circ f$ which sends all points to $\vec{0}$. For $t = 1$, it is the identity map
$i_{\mathbb{R}^n}$. Thus $g \circ f \sim i_{\mathbb{R}^n}$. Of course $f \circ g$ is trivially equal to $i_{\{p\}}$. Thus we have shown
that $\mathbb{R}^n$ is of the same homotopy type as a point, and hence contractible. It
follows that $\pi_1(\mathbb{R}^n) = \{i\}$. (This is also intuitively obvious, for any loop in $\mathbb{R}^n$
is homotopically trivial.)
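The contracting homotopy is simple enough to evaluate by hand, but a few lines of code make its two end points explicit. This is only a sketch, with points of $\mathbb{R}^n$ represented as Python tuples and a sample point chosen arbitrarily by us.

```python
def F(x, t):
    """Homotopy F(x, t) = t x on R^n, with points given as tuples."""
    return tuple(t * xi for xi in x)

x = (3.0, 4.0, 12.0)     # an arbitrary point of R^3
print(F(x, 0))           # (0.0, 0.0, 0.0): the constant map sending everything to the origin
print(F(x, 1) == x)      # True: the identity map
print(F(x, 0.5))         # (1.5, 2.0, 6.0): half-way along the contraction
```

For each fixed $t$ the map is continuous, and as $t$ decreases from 1 to 0 every point slides along a straight line to the origin.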


(ii) $S^2$ is not contractible (this is illustrated in Fig. 2.5). To prove contractibility
of a space, we must find a way to “move” all its points continuously to one point.
On $S^2$ we could try to move everything to the south pole along great circles, but
then the north pole cannot move in any direction without breaking continuity!
Nevertheless, any based loop on $S^2$ can be continuously deformed to a point,
therefore $\pi_1(S^2) = \{i\}$ and $S^2$ is simply connected.


We see that from the point of view of contractibility $S^2$ is nontrivial, but
from the point of view of the fundamental group it is trivial.


(iii) $S^1$, the circle, is not contractible. The proof is just as in the previous
example. But it also has a nontrivial fundamental group, unlike the previous
case. To show this, let $\alpha_n(t)$ be a loop which winds $n$ times around the circle
in an anticlockwise direction for $n > 0$, and $|n|$ times clockwise for $n < 0$. For
$n = 0$, take the constant loop, namely, a point $x_0 \in S^1$. Clearly $[\alpha_n] \ne [\alpha_m]$ for
$m \ne n$, and the collection $[\alpha_n]$, $n \in \mathbb{Z}$ exhausts all homotopy classes. Under
multiplication,


\[ [\alpha_n] * [\alpha_m] = [\alpha_{n+m}] \tag{2.13} \]


Thus, $\pi_1(S^1) = \{ [\alpha_n],\; n \in \mathbb{Z} \}$, with $[\alpha_n] * [\alpha_m] = [\alpha_{n+m}]$. Clearly this group
is isomorphic to the integers under addition. The isomorphism is:


\[ [\alpha_n] \in \pi_1(S^1) \to n \in \mathbb{Z} \tag{2.14} \]




Figure 2.5: $S^2$ is not contractible: we can move all points except the north pole
to the south pole.


The isomorphism permits us to write $\pi_1(S^1) = \mathbb{Z}$. The number $n$ labelling the
class to which a given loop $\alpha(t)$ belongs is called the winding number of the
loop.
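The winding number can be computed numerically by following the angle of the loop around the circle. The sketch below makes several assumptions of our own: a loop on $S^1$ is represented by its angle as a function of $t$, the representatives `loop(n)` are hypothetical, and the loop is sampled finely enough that the angle changes by less than $\pi$ between successive samples.

```python
import math

def loop(n):
    """Hypothetical representative alpha_n: the angle on S^1 as a function of t."""
    return lambda t: 2 * math.pi * n * t

def product(alpha, beta):
    """Product of based loops, written in terms of the angle."""
    return lambda t: alpha(2 * t) if t <= 0.5 else beta(2 * t - 1)

def winding_number(theta, steps=1000):
    """Sum small angle increments, each reduced to (-pi, pi], and divide by 2 pi."""
    total = 0.0
    for k in range(steps):
        d = theta((k + 1) / steps) - theta(k / steps)
        d = math.atan2(math.sin(d), math.cos(d))   # reduce the increment modulo 2 pi
        total += d
    return round(total / (2 * math.pi))

print(winding_number(loop(3)))                      # 3
print(winding_number(product(loop(3), loop(-1))))   # 2
```

The second result illustrates the multiplication rule (2.13): winding numbers add under the product of loops.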


Figure 2.6: A closed loop in $S^2/\theta$.


(iv) Consider the two-sphere $S^2$ with diametrically opposite points identified.
This is the first seriously nontrivial space we are studying! If the identification
map is called $\theta$ then we may think of the space as the quotient $S^2/\theta$. A
representative set of points is the upper hemisphere, which is identified with the
lower hemisphere, as well as half the equator, which is identified with the other
half.

Now all loops which were closed in $S^2$ will remain closed in $S^2/\theta$ and of
course they are still shrinkable. However, there are paths which are open in $S^2$
but closed in $S^2/\theta$. For this, consider any path that starts at the north pole
and ends at the south pole, as is illustrated in Fig. 2.6. This path is closed in
$S^2/\theta$ because the north and south poles are identified, but it is not shrinkable
to a point.


Figure 2.7: $\alpha$ is deformable to $\alpha^{-1}$ on $S^2/\theta$.


Now let us show that this loop $\alpha$ is homotopic to its inverse $\alpha^{-1}$ (see
Fig. 2.7). Simply keep the end points fixed and deform the path so that it goes
down the other side of the sphere. By the identification map, this is the same as
a path going upwards from S to N along the front of the sphere, namely, $\alpha^{-1}$.
Thus we have shown that in this case,


\[ [\alpha] = [\alpha^{-1}] = [\alpha]^{-1} \tag{2.15} \]
So
\[ [\alpha] * [\alpha] = [\alpha^2] = [\alpha] * [\alpha]^{-1} = [i]. \tag{2.16} \]
It is clear that a shrinkable loop based at N (which is closed already in $S^2$) and
the loop $\alpha(t)$ we have just described (which is closed only on $S^2/\theta$) represent
all the homotopy classes of $S^2/\theta$. Thus
\[ \pi_1(S^2/\theta) = \{ [i], [\alpha] \mid [\alpha] * [\alpha] = [i] \} \tag{2.17} \]


Under the homomorphism:
\[ [i] \to 0, \qquad [\alpha] \to 1 \tag{2.18} \]


this becomes the additive group $\mathbb{Z}_2$ of integers modulo 2. This is an example of
a finite homotopy group: $\pi_1(S^2/\theta) = \mathbb{Z}_2$.

(v) $T^2 = S^1 \times S^1$. This is the 2-torus. As visualized in our own 3-dimensional
world, this looks like Fig. 2.8.


To find its fundamental group, we use the theorem stated earlier, about $\pi_1$
of a product space. Thus we have $\pi_1(T^2) = \mathbb{Z} \oplus \mathbb{Z}$. Similarly, for the n-torus
$T^n = S^1 \times S^1 \times \cdots \times S^1$ ($n$ times), we have $\pi_1(T^n) = \mathbb{Z} \oplus \mathbb{Z} \oplus \cdots \oplus \mathbb{Z}$ ($n$ times).

For the two-torus, a homotopy class is labelled by a pair of integers $(n_1, n_2)$
which describe the number of times a loop winds around the first and second
$S^1$ respectively.


Exercise: Describe some loops on the torus which lie in the homotopy classes 
(1,0), (0,1), (1,1). 


Exercise: Find a space $S$ with $\pi_1(S) = \mathbb{Z} \oplus \mathbb{Z}_2$ (recall the theorem on $\pi_1$ of a
product space).


Figure 2.8: The two-torus $S^1 \times S^1$.


(vi) $\mathbb{R}^2 - B^2$: this is the plane with a disc cut out of it. We can find its $\pi_1$
directly, but instead let us show that this space is of the same homotopy type
as $S^1$. This is a nice example of two spaces which are not homeomorphic but
are of the same homotopy type.


Figure 2.9: $\mathbb{R}^2$ minus a disc is of the same homotopy type as $S^1$.


This is illustrated in Fig. 2.9. Define $f : \mathbb{R}^2 - B^2 \to S^1$ to be the map
which sends every point $\vec{x} \in \mathbb{R}^2 - B^2$ to the corresponding unit vector $\hat{x}$, which
lies on $S^1$. In other words, simply project the point down along the line joining
it to the origin until it reaches the unit circle, which defines its image in $S^1$.
The reverse map $g : S^1 \to \mathbb{R}^2 - B^2$ takes all points of the circle $S^1$ and puts
them on the boundary of the disc removed from $\mathbb{R}^2$. Clearly these maps show that
the two spaces are of the same homotopy type: $f \circ g$ is the identity on $S^1$, and
$g \circ f$ is homotopic to the identity on $\mathbb{R}^2 - B^2$. We can imagine “continuously
shrinking” $\mathbb{R}^2 - B^2$ down to the circle. Thus $\pi_1(\mathbb{R}^2 - B^2) = \pi_1(S^1) = \mathbb{Z}$.
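The “continuous shrinking” can be written down explicitly. The sketch below is our own and assumes that the removed disc is the open unit disc, so that the unit circle itself belongs to the space; the homotopy `F` interpolates linearly between the projected point and the original point, and its name and form are hypothetical choices rather than the authors'.

```python
import math

def f(x):
    """Radial projection onto the unit circle: x -> x / |x|."""
    r = math.hypot(x[0], x[1])
    return (x[0] / r, x[1] / r)

def F(x, t):
    """Hypothetical homotopy from g o f (at t = 0) to the identity (at t = 1)."""
    u = f(x)
    return ((1 - t) * u[0] + t * x[0], (1 - t) * u[1] + t * x[1])

x = (3.0, 4.0)                          # a point outside the unit disc, |x| = 5
print(f(x))                             # (0.6, 0.8)
for t in (0.0, 0.5, 1.0):
    p = F(x, t)
    print(t, math.hypot(p[0], p[1]))    # norms of approximately 1, 3 and 5
```

Along the whole deformation the point stays at distance at least 1 from the origin, so it never enters the removed disc, which is what makes `F` a homotopy within $\mathbb{R}^2 - B^2$.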


(vii) $\mathbb{R}^2$ minus two discs (see Fig. 2.10).


We will show that $\pi_1$ of this space is non-Abelian. Consider the two loops
$\alpha$, $\beta$ based at $x_0$ as in the first figure of Fig. 2.10. Clearly they are not homotopic
to each other, so $[\alpha] \ne [\beta]$. Now consider the products $[\alpha] * [\beta] = [\alpha * \beta] = [\gamma]$ and
$[\beta] * [\alpha] = [\beta * \alpha] = [\delta]$ in the same figure. It is evident that $[\gamma] \ne [\delta]$, and therefore
$[\alpha] * [\beta] \ne [\beta] * [\alpha]$. Thus $\pi_1(S)$ is a non-Abelian group in this case.

Figure 2.10: Two based loops $\alpha$, $\beta$; a loop $\gamma$ in the class $[\alpha * \beta]$; a loop $\delta$ in the
class $[\beta * \alpha]$ (in $\mathbb{R}^2$ with two discs removed).


This group is known as the free group on two generators, which simply
means its elements are abstract products of two independent generators (and
their inverses) arranged in any order, with different orderings representing
independent elements of the group.
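The free group on two generators can be modelled by “words” in the generators. In the sketch below, which uses conventions of our own, the letters `a` and `b` stand for the two generators (think of the classes of the two basic loops around the two holes), the capital letters `A` and `B` stand for their inverses, and the only simplification allowed is cancelling a letter against its inverse when the two are adjacent.

```python
def reduce_word(word):
    """Cancel adjacent inverse pairs; generators are 'a', 'b' with inverses 'A', 'B'."""
    stack = []
    for letter in word:
        if stack and stack[-1] == letter.swapcase():
            stack.pop()              # x followed by x^{-1} (or vice versa) cancels
        else:
            stack.append(letter)
    return "".join(stack)

def multiply(w1, w2):
    """Group product: concatenate the two words and reduce."""
    return reduce_word(w1 + w2)

print(multiply("a", "b"))                         # ab
print(multiply("b", "a"))                         # ba
print(multiply("a", "b") == multiply("b", "a"))   # False: the group is non-Abelian
print(repr(multiply("abA", "aBA")))               # '': the empty word is the identity
```

Two reduced words represent the same group element only if they are identical letter by letter, which is the precise sense in which different orderings are independent.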

To see that $\pi_1$ can be non-Abelian in general, consider multiplication of
classes of based loops $[\alpha]$, $[\beta]$ in the two possible orders as depicted in Fig. 2.11.
Keeping the base point fixed, one cannot in general deform the second diagram
of Fig. 2.11 into the first. This is the basic reason why $\pi_1(S)$ is non-Abelian in
general.


Figure 2.11: The product of loops $\alpha$, $\beta$ in different orders. Clearly the two
products are not deformable into each other.


2.4 Higher Homotopy Groups 


We have seen that the fundamental group $\pi_1(S)$ is the collection of homotopy
classes of continuous maps from the closed interval $[0,1]$ to the space $S$, with
the requirement that the map be “based” at a given point $x_0$. This requirement
is implemented by the condition that the end-points of the closed interval be
mapped into a common point in $S$. In this section, we generalize the notion of
loops to higher-dimensional objects mapped into a topological space. This will
allow us to define more homotopy groups associated to a given space. Each one
could then potentially capture some new topological information, which would
lead us closer to the goal of understanding the topology of a space through the
study of homotopy.

A simple generalization of a based loop is obtained by considering maps
from the closed unit square $[0,1] \times [0,1]$ to $S$, with the requirement that the
entire boundary of the square be sent to a common point in $S$. Thus consider

\[ \alpha : [0,1] \times [0,1] \to S \tag{2.19} \]

given by a continuous function $\alpha(t_1, t_2)$ where $t_1$ is in the first $[0,1]$ and $t_2$ is in
the second $[0,1]$, such that

\[ \alpha(0, t_2) = \alpha(1, t_2) = \alpha(t_1, 0) = \alpha(t_1, 1) = x_0. \tag{2.20} \]


Figure 2.12: A based “two loop” maps the boundary of $[0,1] \times [0,1]$ to a single
point $x_0$.


Such a map may be considered the two-dimensional analogue of the based
loops described earlier. Let us call it a 2-loop. Now a homotopy between two
2-loops $\alpha(t_1, t_2)$ and $\beta(t_1, t_2)$ based at the same point $x_0$ is defined by:

\[ H : [0,1] \times [0,1] \times [0,1] \to S \tag{2.21} \]

$H(s, t_1, t_2)$ is a continuous map with the following properties:

\[ \begin{aligned}
H(0, t_1, t_2) &= \alpha(t_1, t_2) \\
H(1, t_1, t_2) &= \beta(t_1, t_2) \\
H(s, t_1, t_2) &= x_0 \quad \text{for all } s, \text{ whenever } t_1 \text{ or } t_2 = 0 \text{ or } 1.
\end{aligned} \tag{2.22} \]

This map describes a continuous deformation of one 2-loop into another,
analogous to what was done for loops in standard homotopy.

All this is easily generalized to “n-loops”, though one should keep in mind 
that “n-loop” is not standard mathematical terminology. 


Definition: An n-loop in $S$ based at $x_0$ is a continuous map:

\[ \alpha : [0,1]^n \to S \]

such that

\[ \alpha(t_1, \ldots, t_n) = x_0 \quad \text{whenever any } t_i = 0 \text{ or } 1. \]

Here, the nth power of the interval $[0,1]$ denotes its Cartesian product with
itself n times.


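Definition: A homotopy of n-loops $\alpha(t_1, \ldots, t_n)$ and $\beta(t_1, \ldots, t_n)$ is a
continuous map

\[ H : [0,1]^{n+1} \to S \]

such that

\[ \begin{aligned}
H(s, t_1, \ldots, t_n)\big|_{s=0} &= \alpha(t_1, \ldots, t_n) \\
H(s, t_1, \ldots, t_n)\big|_{s=1} &= \beta(t_1, \ldots, t_n) \\
H(s, t_1, \ldots, t_n) &= x_0 \quad \text{for all } s, \text{ whenever any } t_i = 0 \text{ or } 1.
\end{aligned} \]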
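
As an illustration of this definition, the following Python sketch builds the
straight-line homotopy that contracts an n-loop in $\mathbb{R}^m$ to the constant
loop, which is possible because $\mathbb{R}^m$ is convex. The sample 2-loop `alpha`,
the base point `X0` and the helper `contract` are hypothetical names chosen
for this example only.

```python
import math


def contract(alpha, x0):
    """Return H(s, t) = (1 - s) * alpha(t) + s * x0, a homotopy from alpha to the constant loop."""
    def H(s, t):
        return tuple((1 - s) * a + s * b for a, b in zip(alpha(t), x0))
    return H


X0 = (0.0, 0.0, 0.0)


def alpha(t):
    """A sample 2-loop in R^3: every boundary point of the unit square is sent to X0."""
    t1, t2 = t
    bump = 16 * t1 * (1 - t1) * t2 * (1 - t2)   # vanishes exactly on the boundary
    return (bump * math.cos(2 * math.pi * t1),
            bump * math.sin(2 * math.pi * t1),
            bump * t2)


H = contract(alpha, X0)
print(H(0.0, (0.3, 0.6)) == alpha((0.3, 0.6)))   # at s = 0 the homotopy is alpha
print(H(1.0, (0.3, 0.6)) == X0)                  # at s = 1 it is the constant loop
print(H(0.5, (1.0, 0.2)) == X0)                  # boundary points stay at X0 for all s
```

The same construction fails in a space such as $\mathbb{R}^3 - \{0\}$, where the
straight line from a point of the loop to the base point may pass through the
deleted origin.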


This homotopy is an equivalence relation, as one can easily check. With
the above definitions, we follow the predictable path of defining the product of
generalised n-loops, leading finally to the concept of the nth homotopy group
$\pi_n$ of a space.


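Definition: The product of two n-loops $\alpha$, $\beta$ is given by

\[ \alpha(t_1, \ldots, t_n) * \beta(t_1, \ldots, t_n) =
\begin{cases}
\alpha(2t_1, t_2, \ldots, t_n), & 0 \le t_1 \le \frac{1}{2} \\
\beta(2t_1 - 1, t_2, \ldots, t_n), & \frac{1}{2} \le t_1 \le 1
\end{cases} \]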


In a previous section on loops, we had constructed the inverse of a loop
by tracing the loop backwards. Now that we are dealing with “n-loops”, an
analogous operation can be defined by simply tracing backwards in the first
argument $t_1$ of the loop, leaving the others unchanged.


Definition: The inverse of an n-loop $\alpha$ is given by

\[ \alpha^{-1}(t_1, \ldots, t_n) = \alpha(1 - t_1, t_2, \ldots, t_n) \]
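
The product and inverse just defined translate directly into code. The sketch
below represents an n-loop as a Python function of its n parameters; the two
sample 2-loops `alpha` and `beta` in $\mathbb{R}^2$, based at the origin, are
assumptions introduced only to exercise the definitions.

```python
def product(alpha, beta):
    """(alpha * beta): run alpha on 0 <= t1 <= 1/2, then beta on 1/2 <= t1 <= 1."""
    def loop(t1, *rest):
        if t1 <= 0.5:
            return alpha(2 * t1, *rest)
        return beta(2 * t1 - 1, *rest)
    return loop


def inverse(alpha):
    """alpha^{-1}(t1, ..., tn) = alpha(1 - t1, t2, ..., tn)."""
    def loop(t1, *rest):
        return alpha(1 - t1, *rest)
    return loop


# Two sample 2-loops in R^2 based at (0, 0); both vanish on the boundary of the square.
def alpha(t1, t2):
    w = t1 * (1 - t1) * t2 * (1 - t2)
    return (w, 0.0)


def beta(t1, t2):
    w = t1 * (1 - t1) * t2 * (1 - t2)
    return (0.0, w)


ab = product(alpha, beta)
print(ab(0.25, 0.5) == alpha(0.5, 0.5))   # the first half traverses alpha
print(ab(0.75, 0.5) == beta(0.5, 0.5))    # the second half traverses beta
print(ab(0.5, 0.5))                       # the seam t1 = 1/2 is sent to the base point
print(inverse(alpha)(0.25, 0.5) == alpha(0.75, 0.5))
```

The product is continuous at the seam $t_1 = \frac{1}{2}$ precisely because both
loops send their boundary to the same base point.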


With all these properties it becomes evident that equivalence classes of 
based n-loops form a group, just as for simple loops. 


Definition: The nth homotopy group of the space $S$, with base point $x_0$, denoted
$\pi_n(S, x_0)$, is

\[ \pi_n(S, x_0) = \{\, [\alpha] \mid \alpha \text{ is an n-loop based at } x_0 \,\} \]

As before, $[\alpha]$ denotes the equivalence class of all loops homotopic to $\alpha$,
and the group operations are just as for 1-loops. In particular, the identity
element of the group corresponds to the homotopy class of the constant loop,
$\alpha(t_1, \ldots, t_n) = x_0$ for all $t_i$. For path-connected spaces, $\pi_n(S, x_0)$ is
independent of $x_0$, as one would expect.


Example: $\pi_2(S)$ is the group of homotopy classes of 2-loops. A 2-loop is just
the image of the 2-sphere $S^2$ in the given topological space. It is easy to see
that in $\mathbb{R}^3$, all 2-spheres can be deformed to a point, so $\pi_2(\mathbb{R}^3) = \{i\}$. On the
other hand in $\mathbb{R}^3 - \{0\}$, a 2-sphere enclosing the origin cannot be shrunk. So
$\pi_2(\mathbb{R}^3 - \{0\})$ is non-trivial. In fact (although it is not as easy to see intuitively
as for the analogous case $\pi_1(\mathbb{R}^2 - \{0\})$), it turns out that

\[ \pi_2(\mathbb{R}^3 - \{0\}) = \mathbb{Z} \tag{2.23} \]

Before giving more examples, let us state an important theorem.


Theorem: The homotopy groups $\pi_n(S)$, $n \ge 2$, are Abelian.


Proof: Instead of giving a formal proof let us use diagrams to indicate how
this works. The relevant diagram for $\pi_2$ is Fig. 2.13. Let us see how $\alpha * \beta$
can be homotopically deformed to $\beta * \alpha$ in this case. Note that $\alpha$ and $\beta$ are
constrained to map all the boundary points to $x_0$. Thus the product $\alpha * \beta$ maps
the boundary of the rectangle to $x_0$. The trick now is to continuously modify the
parametrisation of the map $\alpha * \beta$ while leaving its image in the topological space
unchanged. For a 1-loop we could only speed up or slow down the “traversal
speed”, in terms of the parameter $t$, of each part of the loop. But here, with two
parameters, we can do something more drastic.


Figure 2.13: The product of two-loops $\alpha$, $\beta$ in different orders.

We start by homotopically deforming $\alpha * \beta$ to map many more points in the
rectangle to $x_0$, namely, all those within some distance of the boundary. This
is illustrated in Fig. 2.14, where the whole shaded region is mapped to $x_0$. This
operation is continuous as long as the regions marked 1 and 2, over which the
map is non-constant, are non-empty regions.


Figure 2.14: The map $\alpha * \beta$ deformed so that the entire shaded region is mapped
to the same point $x_0$.


Next, perform the sequence of operations shown in Fig. 2.15. Clearly all the
operations are continuous and during the whole process, the image of the map
remains the same. But at the end we have managed to change the parametrisation
of the map from that of $\alpha * \beta$ to that of $\beta * \alpha$. Thus the two 2-loops are
homotopic and belong in the same equivalence class. Hence,

\[ [\alpha] * [\beta] = [\alpha * \beta] = [\beta * \alpha] = [\beta] * [\alpha] \tag{2.24} \]

and we have shown, as desired, that $\pi_2(S)$ is an Abelian group.


Figure 2.15: Proof that $\pi_2$ is Abelian.


Exercise: Convince yourself that the same procedure does not work for $\pi_1$,
and that it works for all $n \ge 2$.


To conclude this section, we state without proof a few useful results which
tend to occur in problems of physical interest.
(i) $\pi_n(S^n) = \mathbb{Z}$
(ii) $\pi_m(S^n) = 0$, $m < n$
(iii) $\pi_3(S^2) = \mathbb{Z}$
The interested reader should consult the mathematical literature for proofs of
these and many more results on homotopy groups.


All homotopy groups $\pi_n(S)$ of a topological space are invariant under
homeomorphisms, and hence should be thought of as topological invariants
characterising the space. They cannot be changed except by discontinuous
(non-homeomorphic) deformations of the space.


Chapter 3 


Differentiable Manifolds I 


Starting with a topological space, it is sometimes possible to put additional
structure on it which makes it locally look like Euclidean space $\mathbb{R}^n$ of some
dimension $n$. On this space, by imposing suitable requirements, it will then be
possible to differentiate functions. The space so obtained is called a differentiable
manifold. This notion is of central importance in General Relativity, where it
provides a mathematical description of spacetime.


3.1 The Definition of a Manifold 


Among the topological spaces we have studied, the Euclidean space
$\mathbb{R}^n = \mathbb{R} \times \mathbb{R} \times \cdots \times \mathbb{R}$ is special. Besides being a metric space and hence a
topological space with the metric topology, it has an intuitive notion of
“dimension”, which is simply the number $n$ of copies of $\mathbb{R}$.

Let us concentrate on two simple cases: $\mathbb{R}$, the (one-dimensional) real line,
and $\mathbb{R}^2$, the (two-dimensional) plane. If the dimension of a space is to be
topologically meaningful, two different $\mathbb{R}^m$, $\mathbb{R}^n$ should not be homeomorphic
to each other, for $m \neq n$. This is indeed the case.

Let us demonstrate that $\mathbb{R}$ is not homeomorphic to $\mathbb{R}^2$. Assume the contrary:
suppose there is a homeomorphism (recall that this means a 1-1 continuous
function with continuous inverse) $f : \mathbb{R} \to \mathbb{R}^2$. Consider the restriction
of this function: $f : \mathbb{R} - \{0\} \to \mathbb{R}^2 - \{0\}$ where we have deleted the point $\{0\}$
from $\mathbb{R}$, and its image under $f$, which we define to be the origin, from $\mathbb{R}^2$.

Now if $f : \mathbb{R} \to \mathbb{R}^2$ is a homeomorphism then so is $f : \mathbb{R} - \{0\} \to \mathbb{R}^2 - \{0\}$.
But $\mathbb{R} - \{0\}$ is disconnected, while $\mathbb{R}^2 - \{0\}$ is clearly still connected. Since
connectivity is invariant under homeomorphism, this is a contradiction. Thus
the original homeomorphism $f : \mathbb{R} \to \mathbb{R}^2$ does not exist, and $\mathbb{R}$ is not
homeomorphic to $\mathbb{R}^2$. Similarly, one finds that $\mathbb{R}^n$ and $\mathbb{R}^m$ are not homeomorphic
for $m \neq n$. So “dimension” is indeed a topological property.




Now consider two more examples of topological spaces: the circle,

\[ S^1 = \{\, (x,y) \in \mathbb{R}^2 \mid x^2 + y^2 = 1 \,\} \tag{3.1} \]

and the 2-sphere,

\[ S^2 = \{\, (x,y,z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 = 1 \,\} \tag{3.2} \]

$S^1$ and $S^2$ inherit the relative topology as subsets of $\mathbb{R}^2$ and $\mathbb{R}^3$ with the usual
topology. Intuitively, we see that $S^1$ and $S^2$ are 1-dimensional and 2-dimensional
respectively. But in what sense can we make this precise? Certainly they are
not homeomorphic to $\mathbb{R}$ and $\mathbb{R}^2$ respectively.


Theorem: $S^1$ is not homeomorphic to $\mathbb{R}$.


Proof. Assume a homeomorphism $f : S^1 \to \mathbb{R}$. Delete a point from $S^1$, together
with its image in $\mathbb{R}$, and obtain $f : S^1 - \{0\} \to \mathbb{R} - \{0\}$. But $S^1 - \{0\}$ is connected
while $\mathbb{R} - \{0\}$ is not. So $S^1$ and $\mathbb{R}$ are not homeomorphic. Similarly $S^2$, $\mathbb{R}^2$ are
not homeomorphic. (In this case, both remain connected after deleting a point,
but $S^2$ also remains simply connected, while $\mathbb{R}^2$ does not.)


Then what is the common property of $S^1$ and $\mathbb{R}$, or $S^2$ and $\mathbb{R}^2$, which
suggests that they have the same dimension? It is that they are locally
homeomorphic. Let us now make this precise.


Definition: Let $M$ be a topological space. A chart $C$ on $M$ is a collection
$(U, \phi, n)$ such that:

(i) $U$ is an open set of $M$.

(ii) $\phi$ is a homeomorphism: $U \subset M \to \phi(U) \subset \mathbb{R}^n$. (Thus, $\phi(U)$ is open in
$\mathbb{R}^n$.)

(iii) $n$ is a positive integer, called the dimension of the chart $C$.


In plain language, a chart is a piece of the topological space, with a home- 
omorphism equating it to a piece of some Euclidean space. 

It is evident that $\mathbb{R}^n$ comes with a preferred choice of coordinates, called
Cartesian coordinates, namely the $n$ real numbers specifying each point. Now
given a chart, each point on it can be labelled by considering the image of this
point in $\mathbb{R}^n$ and then using the Cartesian coordinates of that point.


Definition: Given a chart $C = (U, \phi, n)$, the coordinates of a point $p \in U \subset M$
are the Cartesian coordinates in $\mathbb{R}^n$ of the point $\phi(p) \in \phi(U) \subset \mathbb{R}^n$. When
we allow the points $p$ to vary over $U$, the Cartesian components of $\phi(p)$ in $\mathbb{R}^n$
define a set of $n$ real-valued functions on $U \subset M$ called coordinate functions.


The concept of chart has enabled us to locally map a piece of a topological
space to $\mathbb{R}^n$. A function on this piece of the topological space can then be
differentiated by simply differentiating, in the familiar sense, the components of
the coordinate functions defined above.

Now we would like to cover the entire topological space with charts so
that we will be able to differentiate functions over the entire topological space.
Necessarily the open sets $U$ will overlap with each other (since a topological
space, unless it is disconnected, cannot be expressed as a union of disjoint open
sets). The notion of differentiability which the charts embody should then be
the same for two overlapping charts in their region of overlap. This leads us to
the following definition.


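Definition: Let $C_1 = (U_1, \phi_1, n)$, $C_2 = (U_2, \phi_2, n)$ be two charts on $M$. Then
$C_1$ and $C_2$ are “$C^\infty$-compatible” or simply “compatible” if:

(i) $U_1 \cap U_2 = \emptyset$, or

(ii) the maps $\phi_1 \circ \phi_2^{-1} : \phi_2(U_1 \cap U_2) \to \phi_1(U_1 \cap U_2)$ and
$\phi_2 \circ \phi_1^{-1} : \phi_1(U_1 \cap U_2) \to \phi_2(U_1 \cap U_2)$ are infinitely differentiable ($C^\infty$) functions.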


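Note that $\phi_1 \circ \phi_2^{-1}$ and $\phi_2 \circ \phi_1^{-1}$ are continuous maps from subsets of $\mathbb{R}^n$
to subsets of $\mathbb{R}^n$. This can be represented as in Fig. 3.1.


Figure 3.1: Definition of a manifold. (Labels in the figure: $\phi_1(U_1 \cap U_2)$ inside
$\phi_1(U_1) \subset \mathbb{R}^n$, $\phi_2(U_1 \cap U_2)$ inside $\phi_2(U_2) \subset \mathbb{R}^n$, and the maps
$\phi_1 \circ \phi_2^{-1}$, $\phi_2 \circ \phi_1^{-1}$ between them.)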


Thus, two charts are compatible if the functions between subsets of $\mathbb{R}^n$
shown in the diagram can be differentiated an arbitrary number of times. Now
we are in a position to cover the space with charts.


Definition: A $C^\infty$-atlas $\mathcal{U}$ on a topological space $M$ is a family of charts
$(U_\alpha, \phi_\alpha, n)$ which covers $M$, so that $\bigcup_\alpha U_\alpha = M$, such that all the charts in the
family are mutually $C^\infty$-compatible.


Definition: Two $C^\infty$-atlases are said to be compatible (with each other) if every
chart of one atlas is compatible with every chart of the other atlas.


Theorem: For atlases, $C^\infty$-compatibility is an equivalence relation.

The proof is useful in giving the reader an idea of what an atlas really means,
and is left as an exercise since it is straightforward. On the way, the alert
reader will discover that compatibility of individual charts is not an equivalence
relation.

Basically, an atlas is a collection of charts that covers the space, such that
we can go from one to the other (this really means going from one subset of $\mathbb{R}^n$
to another) by a differentiable function. The charts and their collections into
atlases form the building blocks of the concept of “differentiable structure” on
a topological space. A space with such a structure will be called a differentiable
manifold.


Definition: A differentiable structure of class $C^\infty$ on a topological space $M$ is
an equivalence class of $C^\infty$-compatible atlases on $M$.


It is always easier to study equivalence classes if we can find a unique
representative of each class in some way, since this can then be used to label the
class. We may pick a unique atlas out of each equivalence class of compatible
atlases as follows: take the union of all atlases in a class and call it the maximal
atlas. Then a differentiable structure is just a choice of maximal atlas $\mathcal{U}$ on $M$.
This choice therefore labels the differentiable structure.

We are finally in a position to define a manifold.


Definition: A ($C^\infty$) differentiable manifold is the pair $(M, \mathcal{U})$ of a Hausdorff
topological space $M$ and a $C^\infty$ differentiable structure $\mathcal{U}$.


From this it follows that a differentiable manifold has the following
properties:


(i) It is locally Euclidean.

(ii) It is locally compact (every point $x \in M$ has a compact neighbourhood).

(iii) An open subset $U \subset M$ is itself a differentiable manifold, called an “open
submanifold”.

(iv) The product of two manifolds is well-defined (using Cartesian products).


In general terms, the definition of differentiable manifold says that each
local region of the space looks like a local region of $\mathbb{R}^n$, and that the many local
regions on $M$ are “patched up” by piecing together local regions on $\mathbb{R}^n$ using
differentiable functions.


Examples:

(i) $\mathbb{R}^n$ is obviously a differentiable manifold. All of it can be covered by a single
coordinate chart, and the coordinate map is the identity.

(ii) $S^1$ is a manifold. We cannot choose all of $S^1$ to be a single chart, since this
is not homeomorphic to any open set of any $\mathbb{R}^n$. So let us cover it with two
overlapping charts. Since $S^1$ is defined as a subset $x^2 + y^2 = 1$ of $\mathbb{R}^2$, it inherits
the relative topology. A basis is given by the usual open intervals.


Figure 3.2: Coordinate charts for the circle, $S^1$. The two panels show
$U_1 : \{\, (x,y) \in S^1 \mid (x,y) \neq (1,0) \,\}$ and $U_2 : \{\, (x,y) \in S^1 \mid (x,y) \neq (-1,0) \,\}$.


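Now to define $S^1$ as a manifold take two open subsets of $S^1$, as shown
in Fig. 3.2. One consists of the entire space minus the point $(1,0)$ while the
other consists of the entire space minus the point $(-1,0)$. Next define the
homeomorphisms:

\[ \begin{aligned}
\phi_1 &: U_1 \to \mathbb{R}, \quad \phi_1(x,y) = \tan^{-1}\frac{y}{x} = \theta_1, \quad \theta_1 \in (0, 2\pi) \\
\phi_2 &: U_2 \to \mathbb{R}, \quad \phi_2(x,y) = \tan^{-1}\frac{y}{x} = \theta_2, \quad \theta_2 \in (-\pi, \pi)
\end{aligned} \tag{3.3} \]

Now we must look at the overlaps $U_1 \cap U_2$. There are two such regions: the
upper half of the circle and the lower half of the circle. On the upper region we
have:

\[ \begin{aligned}
\phi_1(U_1 \cap U_2)_{\text{upper}} &= (0, \pi) \\
\phi_2(U_1 \cap U_2)_{\text{upper}} &= (0, \pi)
\end{aligned} \tag{3.4} \]

and $\phi_2 \circ \phi_1^{-1} : (0, \pi) \to (0, \pi)$ is the identity map:

\[ \phi_2 \circ \phi_1^{-1}(\theta_1) = \theta_1. \tag{3.5} \]

On the lower region, we find:

\[ \begin{aligned}
\phi_1(U_1 \cap U_2)_{\text{lower}} &= (\pi, 2\pi) \\
\phi_2(U_1 \cap U_2)_{\text{lower}} &= (-\pi, 0)
\end{aligned} \tag{3.6} \]

So, $\phi_2 \circ \phi_1^{-1} : (\pi, 2\pi) \to (-\pi, 0)$ is the map:

\[ \phi_2 \circ \phi_1^{-1}(\theta_1) = \theta_1 - 2\pi \tag{3.7} \]

Both $\theta_1$ and $\theta_1 - 2\pi$ are functions that can be differentiated arbitrarily many
times (in this simple case they are just linear functions!). This shows that $S^1$
is a differentiable manifold.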
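
The transition functions (3.5) and (3.7) can be checked numerically. The
following Python sketch is only an illustration: it uses `math.atan2` in place
of $\tan^{-1}(y/x)$ so that the correct branch of the angle is selected
automatically, and the function names `phi1`, `phi2` and `transition` are
labels chosen here rather than notation fixed by the text.

```python
import math


def phi1(x, y):
    """Chart on U1 = S^1 minus (1, 0): angle theta1 in (0, 2*pi)."""
    theta = math.atan2(y, x)              # atan2 returns a value in (-pi, pi]
    return theta if theta > 0 else theta + 2 * math.pi


def phi2(x, y):
    """Chart on U2 = S^1 minus (-1, 0): angle theta2 in (-pi, pi)."""
    return math.atan2(y, x)


def transition(theta1):
    """phi2 o phi1^{-1}, defined for theta1 in (0, pi) or (pi, 2*pi)."""
    x, y = math.cos(theta1), math.sin(theta1)   # the point phi1^{-1}(theta1)
    return phi2(x, y)


for theta1 in (0.5, 2.0, 4.0, 6.0):
    expected = theta1 if theta1 < math.pi else theta1 - 2 * math.pi
    print(theta1, math.isclose(transition(theta1), expected))
```

On each of the two overlap regions the transition function is linear in
$\theta_1$, which is the smoothness statement made above.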
(iii) $S^n$, the n-sphere, defined by $\sum_{i=1}^{n+1} x_i^2 = 1$ in $\mathbb{R}^{n+1}$, is a differentiable
manifold. This can be shown using stereographic projection. Let us work this out
pictorially and then algebraically.




On $S^2$, choose the open sets (in the topology induced from $\mathbb{R}^3$) to be
$U_1 = S^2 - \{\text{north pole}\}$, $U_2 = S^2 - \{\text{south pole}\}$. We map these onto $\mathbb{R}^2$ by
stereographic projection as illustrated in Fig. 3.3. Each point $p \in S^2 - \{\text{north pole}\}$
goes to a corresponding point $p' \in \mathbb{R}^2$. For the other chart $U_2$, projection
is done through the south pole. The projection is clearly a homeomorphism. In
this case the image of the projection in $\mathbb{R}^2$ is the whole of $\mathbb{R}^2$ rather than a
subset, which is fine since the whole space is always an open set of its topology.
The picture enables one to visualise the maps $\phi_1$, $\phi_2$ and convince oneself that
$\phi_1 \circ \phi_2^{-1}$, $\phi_2 \circ \phi_1^{-1}$ are $C^\infty$.


Figure 3.3: Stereographic projection of $S^2$ onto $\mathbb{R}^2$.


This pictorial construction is hard to generalise to $S^n$ for $n > 2$ but only
because our imagination does not work in high dimensions! However, the
algebraic description of stereographic projections does work in all dimensions. Let
us work it out explicitly for $S^1$, since the construction is analogous for all $S^n$.
Indeed, this will give us another way of specifying coordinate charts on $S^1$,
besides the one described in a previous example.

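From Fig. 3.4, the point $p \in S^1$ is projected onto the point:

\[ \phi_1(p) = \frac{2a}{\tan\frac{\theta}{2}} \tag{3.8} \]

This is defined for $\theta \neq 0$, that is, $p \in M_1$, where $M_1$ is the chart excluding $\theta = 0$.
For the opposite stereographic projection (upwards from the south pole), we
will get

\[ \phi_2(p) = 2a \tan\frac{\theta}{2} \tag{3.9} \]

which is defined for $\theta \neq \pi$, hence for $p \in M_2$.
Now consider $\phi_2 \circ \phi_1^{-1} : \mathbb{R} \to \mathbb{R}$. We have:

\[ \begin{aligned}
\phi_1^{-1}(x) &= 2\cot^{-1}\frac{x}{2a} \\
\phi_2 \circ \phi_1^{-1}(x) &= 2a \tan\left(\cot^{-1}\frac{x}{2a}\right) = \frac{4a^2}{x}
\end{aligned} \tag{3.10} \]

This is the inversion map $\mathbb{R} - \{0\} \to \mathbb{R} - \{0\}$, which is infinitely differentiable.
The map $\phi_1 \circ \phi_2^{-1}$ works similarly.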
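
The inversion formula in (3.10) can be verified numerically. The sketch below
is an illustration under stated assumptions: the length $a$ is set to 1 (its
value is arbitrary), points of $S^1$ are labelled by their angle $\theta$, and
the inverse cotangent is written with `math.atan2` so that $\theta$ lands in
$(0, 2\pi)$.

```python
import math

A = 1.0  # the length a appearing in (3.8) and (3.9); the value is arbitrary


def phi1(theta):
    """Projection (3.8): x = 2a / tan(theta / 2), defined for theta != 0."""
    return 2 * A / math.tan(theta / 2)


def phi2(theta):
    """Projection (3.9): x = 2a * tan(theta / 2), defined for theta != pi."""
    return 2 * A * math.tan(theta / 2)


def phi1_inverse(x):
    """theta = 2 * arccot(x / 2a), written with atan2 so that theta lies in (0, 2*pi)."""
    return 2 * math.atan2(2 * A, x)


for theta in (0.3, 1.0, 2.5, 4.0, 5.5):
    x = phi1(theta)
    print(math.isclose(phi1_inverse(x), theta),                  # phi1_inverse undoes phi1
          math.isclose(phi2(phi1_inverse(x)), 4 * A ** 2 / x))   # transition map is 4a^2 / x
```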


Figure 3.4: A simpler case: stereographic projection of $S^1$ onto $\mathbb{R}$.


3.2 Differentiation of Functions 


One useful outcome of our definition of differentiable manifolds is that we can
now differentiate functions from one manifold $X$ to another manifold $Y$. The
idea is simple. Take a chart on $X$ and map it to an open set of some $\mathbb{R}^n$. Take
the image of this chart in $Y$ under the given function and map it to an open
set of some $\mathbb{R}^m$. Then composing the coordinate maps and the given function
appropriately, we get a function from a region of $\mathbb{R}^n$ to a region of $\mathbb{R}^m$, which
we can then differentiate.

Let us work this out. Take two manifolds $X$ and $Y$ and a function
$f : X \to Y$. The coordinate map $\phi$ takes an open set $U$ of $X$ to an open set
$\phi(U) \subset \mathbb{R}^n$. Similarly the coordinate map $\psi$ takes an open set $V$ of $Y$ and
maps it to an open set $\psi(V) \subset \mathbb{R}^m$. The situation is illustrated in Fig. 3.5.
Clearly the function $\psi \circ f \circ \phi^{-1}$ takes us from $\phi(U)$ to $\psi(V)$. If this function is
$C^\infty$ (infinitely differentiable) for all $U$, $V$, then we say that $f : X \to Y$ is a $C^\infty$
map.


Figure 3.5: Differentiation of a function $f : X \to Y$.


In the previous section we defined two kinds of coordinates on the topological
space $S^1$, one by a simple map of segments and the other by stereographic
projection. Have we defined two different differentiable manifolds? To answer
this, we need to define what we mean by equivalence of two differentiable
manifolds. Clearly they must, for a start, be homeomorphic as topological spaces.
But the differentiable nature must also be preserved.


Definition: Two manifolds $X$, $Y$ are said to be diffeomorphic if there exists a
homeomorphism $f : X \to Y$ such that $f$ is a $C^\infty$ function with a $C^\infty$ inverse.
$f$ is called a diffeomorphism.


If two differentiable manifolds are diffeomorphic, we think of them as the 
same manifold. 

As an example, let us reconsider $S^1$, which was discussed in Examples (ii)
and (iii) of the previous section. In (ii), the coordinate on one patch was labelled
$\theta$. In (iii), the coordinate was $x$. Thus we have a map,

\[ x = \frac{2a}{\tan\frac{\theta}{2}} \tag{3.11} \]

which (on $(0, \pi)$) is $C^\infty$ with a $C^\infty$ inverse, hence it is a diffeomorphism and the
two descriptions are diffeomorphic to each other. Thus the two ways in which
we defined $S^1$ gave rise to the same differentiable manifold.


Theorem: $S^n$, $n \le 6$, admits a unique differentiable structure. But for $n \ge 7$
there are many possible inequivalent differentiable structures. (This is a highly
non-trivial result!)


3.3 Orientability 


For 2-dimensional manifolds visualised in 3 dimensions, we have the intuitive
notion of two different sides of the manifold. For $\mathbb{R}^2$, we often say “out of
plane” and “into the plane”, while for $S^2$ there is an “outside” and an “inside”.
This is an embryonic form of the notion of orientation and orientability, which
we will define precisely in a moment.

First, consider the strange example of a two-dimensional manifold which
has only one side. This is called the Möbius strip. It is constructed by taking
the rectangle $\{\, (x,y) \in \mathbb{R}^2 \mid 0 \le x \le a,\ 0 \le y \le b \,\}$ and identifying a pair of
opposite sides with a “twist”, in other words, joining them so that the arrows
match in Fig. 3.6. If we now take a normal vector and call it “outward”, then on
transporting it once around the strip, it comes back pointing “inward”, namely
its direction has been reversed. There is only one “side” to the Möbius strip!


Figure 3.6: How to make a Möbius strip.


To formalize this notion for general manifolds, consider two overlapping
coordinate patches $(U_\alpha, \phi_\alpha)$ and $(U_\beta, \phi_\beta)$. Consider the map,

\[ \phi_\alpha \circ \phi_\beta^{-1} : \phi_\beta(U_\alpha \cap U_\beta) \to \phi_\alpha(U_\alpha \cap U_\beta) \tag{3.12} \]

As a map $\mathbb{R}^n \to \mathbb{R}^n$, we can describe this by a function

\[ f : x^i \to y^i(x^1, \ldots, x^n) \tag{3.13} \]

The Jacobian of this map is the determinant:

\[ J = \left\| \frac{\partial y^i}{\partial x^j} \right\| \tag{3.14} \]


Definition: A manifold is orientable if we can choose an atlas on it such that
on every overlap $U_\alpha \cap U_\beta$ of charts, the map $f = \phi_\alpha \circ \phi_\beta^{-1}$ has a positive Jacobian
determinant.


Example: The cylinder is orientable. Define the cylinder as $S^1 \times [0,1]$. Choose
the coordinate patches to be $U_1 \times (0,1)$ and $U_2 \times (0,1)$, where $U_1$, $U_2$ were the
coordinate patches on $S^1$ in Example (ii) of Section 3.1. Define the homeomorphisms
$\phi_1 : U_1 \cap U_2 \to \mathbb{R}$, $\phi_2 : U_1 \cap U_2 \to \mathbb{R}$ (see Fig. 3.7).


Figure 3.7: Coordinate charts for a cylinder. The two panels show
$\phi_1((U_1 \cap U_2) \times (0,1))$ and $\phi_2((U_1 \cap U_2) \times (0,1))$.


Now the function $\phi_2 \circ \phi_1^{-1}$ is:

\[ \phi_2 \circ \phi_1^{-1} :
\begin{cases}
(x^1, x^2) \to (x^1, x^2), & 0 < x^1 < \pi \\
(x^1, x^2) \to (x^1 - 2\pi, x^2), & \pi < x^1 < 2\pi
\end{cases} \tag{3.15} \]

In other words $y^i = y^i(x)$ is: $(y^1, y^2) = (x^1, x^2)$ or $(y^1, y^2) = (x^1 - 2\pi, x^2)$. In
each region, we have $\partial y^i / \partial x^j = \delta^i_j$ and $J = 1$, so the cylinder is orientable.




For the Möbius strip, we define the manifold by specifying different
functions on the overlaps:

\[ \phi_2 \circ \phi_1^{-1} :
\begin{cases}
(x^1, x^2) \to (x^1, x^2), & 0 < x^1 < \pi \\
(x^1, x^2) \to (x^1 - 2\pi, 1 - x^2), & \pi < x^1 < 2\pi
\end{cases} \tag{3.16} \]

Then $J = 1$ and $J = -1$ in the two segments respectively. This proves that the
Möbius strip is not orientable.
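
The two Jacobians can be computed mechanically. The Python sketch below
approximates $\partial y^i / \partial x^j$ by central differences for the transition
maps (3.15) and (3.16); the helper `jacobian_det` and the sample points are
assumptions made for this illustration.

```python
import math


def jacobian_det(f, x1, x2, h=1e-6):
    """Determinant of dy^i/dx^j for a map f : R^2 -> R^2, by central differences."""
    d1 = [(a - b) / (2 * h) for a, b in zip(f(x1 + h, x2), f(x1 - h, x2))]  # d y / d x1
    d2 = [(a - b) / (2 * h) for a, b in zip(f(x1, x2 + h), f(x1, x2 - h))]  # d y / d x2
    return d1[0] * d2[1] - d2[0] * d1[1]


def cylinder(x1, x2):
    """Transition map (3.15)."""
    return (x1, x2) if x1 < math.pi else (x1 - 2 * math.pi, x2)


def moebius(x1, x2):
    """Transition map (3.16)."""
    return (x1, x2) if x1 < math.pi else (x1 - 2 * math.pi, 1 - x2)


for name, f in (("cylinder", cylinder), ("Moebius strip", moebius)):
    # One sample point in each of the two overlap regions.
    signs = [round(jacobian_det(f, x1, 0.5)) for x1 in (1.0, 5.0)]
    print(name, signs)
```

For the cylinder both overlap regions give $J = +1$, while for the Möbius strip
the second region gives $J = -1$, in agreement with the statements above.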


3.4 Calculus on Manifolds: Vector and Tensor 
Fields 


One of the main purposes of defining a differentiable manifold is that, given its
local equivalence to $\mathbb{R}^n$, we can try to do calculus on the manifold just as we do
in $\mathbb{R}^n$. All we have to do is use coordinate charts to get from the manifold to a
subset of $\mathbb{R}^n$, after which we do the usual calculus there. The only subtlety is
that, since the manifold in general requires more than one chart to cover it, we
have to learn how to transfer our calculations across charts.

Let us work with real-valued functions on a manifold $M$, namely maps
$f : M \to \mathbb{R}$. If not otherwise specified, “function on $M$” will always mean a
real-valued function. A simple example is the temperature at each point on the
surface of the earth, which defines a real-valued function $f : S^2 \to \mathbb{R}$.

Next given such a function $f$, consider, in a coordinate patch, the associated
map $\mathbb{R}^n \to \mathbb{R}$ (see Fig. 3.8). We will frequently use the notation $x^i$ for
coordinates on $\phi_\alpha(U_\alpha)$. Also, the function $f \circ \phi_\alpha^{-1}$ in the given coordinate system
is denoted as:

\[ f : x^i \in \mathbb{R}^n \to f(x^i) \in \mathbb{R}. \tag{3.17} \]

In a different coordinate system, we will have a different function

\[ f' : y^i \in \mathbb{R}^n \to f'(y^i) \in \mathbb{R} \tag{3.18} \]

But these must produce the same map $f : M \to \mathbb{R}$ for any point in $M$ which
lies in the overlap of $U_\alpha$, $U_\beta$. Thus on the overlap, the relation is

\[ f'(y^i) = f(x^i) \tag{3.19} \]

with $y^i = y^i(x)$ given by $\phi_\beta \circ \phi_\alpha^{-1}$.


Figure 3.8: Differentiation of a real-valued function.


Now to differentiate $f : M \to \mathbb{R}$, we simply carry this out on $f(x) :
\mathbb{R}^n \to \mathbb{R}$. If $f(x)$ is $C^\infty$, then $f : M \to \mathbb{R}$ is said to be $C^\infty$. Differentiating
$f(x) : \phi_\alpha(U_\alpha) \subset \mathbb{R}^n \to \mathbb{R}$, we get a set of functions:

\[ \frac{\partial f}{\partial x^i} : \phi_\alpha(U_\alpha) \subset \mathbb{R}^n \to \mathbb{R}, \qquad i = 1, \ldots, n. \tag{3.20} \]

Though the function $f$ was coordinate invariant on overlaps, as it must be, we
now see that its derivative does not enjoy the same property. Indeed,

\[ \frac{\partial f'(y)}{\partial y^i} = \frac{\partial f(x)}{\partial y^i} = \frac{\partial x^j}{\partial y^i} \frac{\partial f(x)}{\partial x^j} \neq \frac{\partial f(x)}{\partial x^i} \tag{3.21} \]


(Here and below, summation is implied on repeated indices.) Instead of coor- 
dinate invariance, we realize that derivatives of functions possess a more com- 
plicated transformation property across patches. This may be thought of as 
“covariance”, in other words, a specific transformation law relating the same 
object in different coordinate patches. 

To understand this better, consider a parametrised curve $p(t)$, $p \in M$. This
is just some one-dimensional submanifold of $M$, whose points are labelled by a
continuous parameter $t$. In some given patch, this goes to a curve $x^i(t) \in \mathbb{R}^n$.
The tangent vector at $t_0$ to a curve $x^i(t) \subset \mathbb{R}^n$ is well-known, from elementary
geometry, to be $dx^i/dt\,|_{t=t_0}$. (One can think of the tangent vector as measuring
the “rate of motion” of a point along the curve, as a function of the “time” $t$,
while it also specifies the instantaneous direction of motion.) But here, $dx^i/dt$
in $\phi_\alpha(U_\alpha)$ describes some property of the original curve $p(t) \subset M$. Accordingly,
we define this to be the tangent vector to the curve on $M$, in the given coordinate
patch.


Definition: The tangent vector to a curve $p(t) \subset M$ in a coordinate patch $U_\alpha$
is $dx^i/dt$ where $x^i(t)$ are the coordinates of the image $\phi_\alpha(p(t))$.


On the overlap of two patches, we have

\[ x^i \to y^i(x), \qquad \frac{dx^i}{dt} \to \frac{dy^i}{dt} = \frac{\partial y^i}{\partial x^j} \frac{dx^j}{dt} \tag{3.22} \]

Thus given a tangent vector in one patch, the above rule tells us how to relate
it to the tangent vector in another overlapping patch. This gives an invariant
meaning to the tangent vector. We may say that the tangent vector is the
collection of objects $dx^i/dt$ in each patch $x^i$, related on overlaps by the above
rule. This is really what we meant by “covariance” above.
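
The transformation rule (3.22) can be checked on a concrete pair of coordinate
systems. The sketch below takes Cartesian coordinates as $x^i$ and polar
coordinates as $y^i$, with a sample curve; all of these choices, and the
function names, are assumptions made for illustration.

```python
import math


def to_polar(x1, x2):
    """New coordinates y^i(x): y^1 = r, y^2 = theta."""
    return (math.hypot(x1, x2), math.atan2(x2, x1))


def curve(t):
    """A sample curve x^i(t) in the first patch."""
    return (1.0 + t, 2.0 * t + t * t)


def velocity(t):
    """dx^i/dt for the sample curve."""
    return (1.0, 2.0 + 2.0 * t)


def jacobian(x1, x2):
    """Analytic dy^i/dx^j for the polar map."""
    r = math.hypot(x1, x2)
    return ((x1 / r, x2 / r),
            (-x2 / r ** 2, x1 / r ** 2))


def transformed_velocity(t):
    """Right-hand side of (3.22): (dy^i/dx^j) dx^j/dt."""
    J, v = jacobian(*curve(t)), velocity(t)
    return tuple(sum(J[i][j] * v[j] for j in range(2)) for i in range(2))


def direct_velocity(t, h=1e-6):
    """dy^i/dt obtained by differentiating y^i(x(t)) numerically."""
    plus, minus = to_polar(*curve(t + h)), to_polar(*curve(t - h))
    return tuple((p - m) / (2 * h) for p, m in zip(plus, minus))


t0 = 0.3
print(transformed_velocity(t0))
print(direct_velocity(t0))
```

The two printed tuples agree up to the error of the finite-difference
approximation, which is the content of the rule.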




Tangent vectors at a point form a vector space. If we have two curves
$p(t)$, $q(t) \subset M$ passing through the same point at $t_0$, then we can take the sum
of the tangent vectors to each curve at $t_0$ to get a new vector in $\mathbb{R}^n$, denoted

\[ a^i(x_0) = \left[ \frac{dx^i}{dt}(p(t)) + \frac{dx^i}{dt}(q(t)) \right]_{t=t_0} \tag{3.23} \]

This defines a straight line in $\mathbb{R}^n$:

\[ x^i(t) = x_0^i + a^i(x_0)\, t \tag{3.24} \]

where $x_0$ is the image of the common point $p(t_0) = q(t_0)$ on the manifold.

This straight line can be mapped back to $M$, to give a new curve on $M$
around the point where $p(t)$, $q(t)$ intersect, using $\phi_\alpha^{-1}$. Thus, the sum of tangent
vectors to two curves passing through a point can be regarded as a tangent vector
to a third curve passing through the same point.


Definition: The vector space of tangent vectors at a point $p \in M$ is called the
tangent space $T_p(M)$ at $p$.


Note that any collection of numbers $a^i$ and a coordinate system $x^i$ define
a tangent vector. Here is the construction:


(i) Pick a point $x_0^i \in \mathbb{R}^n$ and define any curve $x^i(t)$ in $\mathbb{R}^n$ such that
$dx^i/dt\,|_{x_0} = a^i$.

(ii) “Lift” the curve back to $M$ using $\phi_\alpha^{-1}$ for that coordinate system.

(iii) In any other coordinate system $y^i(x)$, define $a'^i = \frac{\partial y^i}{\partial x^j}\big|_{x_0} a^j$ to be the same
object in the new coordinates.

Let us now combine the two concepts we have just discussed: differentiation
of functions on $M$, and tangent vectors to curves on $M$. In this way we will be
able to provide a coordinate invariant definition of a tangent vector.

Recall how we differentiated a $C^\infty$ function on $M$ along a given curve. If
the function on $M$ is $f$, then along the curve it is $f(p(t))$, and its derivative
along the curve is defined to be $\frac{d}{dt} f(p(t))$. Clearly this defines a new function on
the same curve.

Using the coordinate map we can express this as

\[ \frac{df}{dt} = \frac{\partial f}{\partial x^i} \frac{dx^i}{dt} \tag{3.25} \]

Thus, the derivative of $f$ along a curve, at some point, is expressed in terms
of its gradient $\partial f / \partial x^i$ in the chosen coordinate system, contracted with the
tangent vector at that point. But while both $\partial f / \partial x^i$ and $dx^i/dt$ are
coordinate-dependent objects, when contracted together they give $df/dt$ which is manifestly
coordinate independent.
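
A short numerical check makes the coordinate independence of the contraction
tangible. In the Python sketch below the function $f(x) = (x^1)^2 + x^2$, the
point and the tangent vector are all sample choices made for illustration;
the same contraction is evaluated once in Cartesian and once in polar
coordinates.

```python
import math


def grad_f_cartesian(x1, x2):
    """df/dx^i for the sample function f(x) = x1**2 + x2."""
    return (2 * x1, 1.0)


def grad_f_polar(r, th):
    """Gradient of the same function in polar form, f'(r, th) = (r cos th)**2 + r sin th."""
    return (2 * r * math.cos(th) ** 2 + math.sin(th),
            -2 * r ** 2 * math.cos(th) * math.sin(th) + r * math.cos(th))


def contraction(gradient, velocity):
    """(df/dx^i)(dx^i/dt), summed over i as in (3.25)."""
    return sum(g * v for g, v in zip(gradient, velocity))


# A point and a tangent vector, given by their Cartesian components.
x1, x2 = 1.2, 0.7
v = (0.5, -2.0)

# The same point and vector in polar components: dy^i/dt = (dy^i/dx^j) dx^j/dt.
r, th = math.hypot(x1, x2), math.atan2(x2, x1)
v_polar = ((x1 * v[0] + x2 * v[1]) / r, (-x2 * v[0] + x1 * v[1]) / r ** 2)

print(contraction(grad_f_cartesian(x1, x2), v))
print(contraction(grad_f_polar(r, th), v_polar))
```

The gradient and the tangent vector each change from one coordinate system to
the other, but the two printed numbers coincide up to rounding.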

Writing

\[ \frac{df}{dt} = \left[ \frac{dx^i}{dt}(p(t)) \frac{\partial}{\partial x^i} \right] f \tag{3.26} \]

we see that $\frac{dx^i}{dt}(p(t)) \frac{\partial}{\partial x^i}$ is a coordinate-invariant first order differential operator
which maps any $C^\infty$ function on a curve in a manifold into another such function
on this curve. At a point $x_0^i = x^i(t_0)$, the operator is $\frac{dx^i}{dt}\big|_{t=t_0} \frac{\partial}{\partial x^i}$. This carries
precisely the same information as the tangent vector $\frac{dx^i}{dt}\big|_{t=t_0}$. So we can as well
call this differential operator a tangent vector at $x_0$.

All we have done is to take our previous definition of tangent vector and
contract it with $\partial / \partial x^i$ to make it a differential operator on functions. By this
process it also becomes manifestly coordinate invariant.

So far we have confined our attention to tangent vectors defined at a fixed
point $p \in M$. These can be written $a^i\, \partial / \partial x^i$ for some constants $a^i$. But now
consider the differential operator $X = a^i(x)\, \partial / \partial x^i$, where $a^i(x)$ are $C^\infty$
functions. This operator maps $C^\infty$ functions over a whole coordinate patch into
new $C^\infty$ functions on the same patch:

\[ X : f \to Xf = a^i(x) \frac{\partial f}{\partial x^i} \tag{3.27} \]

We call $X$ a vector field.


Given two functions $f$ and $g$ on $M$, we have $X(fg) = (Xf)g + f(Xg)$. In
fact, this property characterizes vector fields, and we can as well define a vector
field this way.


Definition: A vector field is a linear map from $C^\infty$ functions to $C^\infty$ functions
on a manifold $M$, satisfying $X(fg) = (Xf)g + f(Xg)$.
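
The Leibniz property can be tested numerically in a single coordinate patch.
The sketch below builds the operator $X = a^i(x)\, \partial / \partial x^i$ on $\mathbb{R}^2$ with
partial derivatives approximated by central differences; the rotation field
and the functions `f`, `g` are sample choices, not taken from the text.

```python
import math


def vector_field(components, h=1e-6):
    """Return the operator X = a^i(x) d/dx^i acting on functions of x = (x1, x2)."""
    def X(f):
        def Xf(x1, x2):
            a1, a2 = components(x1, x2)
            d1 = (f(x1 + h, x2) - f(x1 - h, x2)) / (2 * h)   # df/dx1 by central differences
            d2 = (f(x1, x2 + h) - f(x1, x2 - h)) / (2 * h)   # df/dx2
            return a1 * d1 + a2 * d2
        return Xf
    return X


X = vector_field(lambda x1, x2: (-x2, x1))      # sample field: generator of rotations
f = lambda x1, x2: x1 ** 2 + x2
g = lambda x1, x2: math.sin(x1) * x2
fg = lambda x1, x2: f(x1, x2) * g(x1, x2)

p = (0.8, -0.4)
lhs = X(fg)(*p)                                  # X(fg)
rhs = X(f)(*p) * g(*p) + f(*p) * X(g)(*p)        # (Xf) g + f (Xg)
print(lhs, rhs)
```

The two numbers agree to within the accuracy of the finite-difference
derivative, as the product rule requires.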


Let us now generalise this concept to tensor fields. Just as vector fields
are linear maps, tensor fields will be defined as multilinear maps from sets of
functions to sets of functions. Consider an ordered set of $n$ functions $(f_1, \ldots, f_n)$
on a manifold $M$. This set forms a vector space under addition in the obvious
way. Moreover, we can multiply two elements of this space as follows:

\[ (f_1, \ldots, f_n) * (g_1, \ldots, g_n) = (f_1 g_1, \ldots, f_n g_n) \]

Now a map $X : (f_1, \ldots, f_n) \to X(f_1, \ldots, f_n)$, taking each set of functions in
this space to another set, is called multilinear if

\[ X(a_1 f_1, \ldots, a_n f_n) = a_1 \cdots a_n\, X(f_1, \ldots, f_n) \tag{3.28} \]


Definition: A tensor field of rank $n$ is a multilinear map

\[ X : (f_1, \ldots, f_n) \to X(f_1, \ldots, f_n) \tag{3.29} \]

satisfying

\[ X\big( (f_1, \ldots, f_n) * (g_1, \ldots, g_n) \big) =
\big( X(f_1, \ldots, f_n) \big) * (g_1, \ldots, g_n) + (f_1, \ldots, f_n) * \big( X(g_1, \ldots, g_n) \big) \tag{3.30} \]




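The nth rank tensor fields form a vector space which we denote
$T_p(M) \otimes T_p(M) \otimes \cdots \otimes T_p(M)$, or simply $\otimes_n T_p(M)$. This is the n-fold tensor product
of the tangent space with itself. Since $\partial / \partial x^i$ form a basis for $T_p(M)$, a basis
for the n-fold tensor product space is given by
$\partial / \partial x^{i_1} \otimes \partial / \partial x^{i_2} \otimes \cdots \otimes \partial / \partial x^{i_n}$.
Thus any $A \in \otimes_n T_p(M)$ can be written

\[ A = A^{i_1 i_2 \cdots i_n}(x)\, \frac{\partial}{\partial x^{i_1}} \otimes \frac{\partial}{\partial x^{i_2}} \otimes \cdots \otimes \frac{\partial}{\partial x^{i_n}} \tag{3.31} \]

Clearly the coefficients $A^{i_1 i_2 \cdots i_n}(x)$ transform like nth rank contravariant
tensors, in “physics” terminology, and we call $A \in (T_p(M))^n$ an nth rank tensor
field.


Now suppose we have two manifolds $M$ and $N$ and a $C^\infty$ map

\[ \phi : M \to N \tag{3.32} \]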


This map induces another map, called the differential map 
bs : Tp(M) > Typ) (N) (3.33) 


which maps the tangent space to M at p, to the tangent space to N at ¢(p). 
To define this map, note the following: if f : N — R is a C™ function on N, 
then fog: M — Ris a C™ function on M (Fig. 3.9). 


Figure 3.9: How to define the differential map $\phi_*$.


A tangent vector acts on functions, as we have discussed above. Given
a vector $X \in T_p(M)$, we define $\phi_* X \in T_{\phi(p)}(N)$ to have the same action on
$f : N \to \mathbb{R}$ as $X \in T_p(M)$ would have on $f \circ \phi : M \to \mathbb{R}$:

$$ (\phi_* X) f = X(f \circ \phi) \tag{3.34} $$

Notice that for given X and f, both sides of this equation are just real numbers.
The above equation defines a map $\phi_* : T_p(M) \to T_{\phi(p)}(N)$ by $\phi_* : X \to \phi_* X$.
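In local coordinates the definition (3.34) reduces to the chain rule. The following worked step is a sketch under assumed notation: $x^i$ are coordinates on M near p, $y^\alpha$ are coordinates on N near $\phi(p)$, $\phi^\alpha(x)$ are the coordinate functions of the map, and $a^i$ are the components of X. None of these labels is fixed by the text above.

```latex
\begin{align*}
X &= a^i \frac{\partial}{\partial x^i}\Big|_{p}, \qquad y^\alpha = \phi^\alpha(x) \\
(\phi_* X) f &= X(f \circ \phi)
  = a^i \frac{\partial}{\partial x^i} f(\phi(x)) \Big|_{p}
  = a^i \frac{\partial \phi^\alpha}{\partial x^i}\Big|_{p}\,
    \frac{\partial f}{\partial y^\alpha}\Big|_{\phi(p)} \\
\Longrightarrow \quad
\phi_* X &= \Big( a^i \frac{\partial \phi^\alpha}{\partial x^i} \Big)
  \frac{\partial}{\partial y^\alpha}\Big|_{\phi(p)}
\end{align*}
```

So in components the differential map acts on tangent vectors through the Jacobian matrix of $\phi$.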




3.5 Calculus on Manifolds: Differential Forms 


Having defined vector and tensor fields on a manifold, we now turn to the 
definition of their “dual” objects, differential forms. Both tensor fields and 
differential forms play a crucial role in physics, where we tend to think of their 
roles as rather similar. However in the context of differentiable manifolds the 
two have very different meanings, as we will try to bring out in what follows. 

Given a vector space V with basis $e_i$, $i = 1, \ldots, n$, the dual vector space
$V^*$ is defined to be the vector space generated by the dual basis $e^{*i}$ given by an
inner product

$$ \langle e^{*i}, e_j \rangle = \delta^i_j \tag{3.35} $$


For finite-dimensional vector spaces, the dual space is isomorphic to the 
original one. However, the pairing between a vector space and its dual can 
lead to different properties for the elements of the two spaces. These properties 
are induced by a space on its dual if we require that the inner product remain 
invariant under certain transformations. In the present case we want to study 
the dual of the tangent space, and we will derive the transformation laws across 
coordinate patches for elements of this space by requiring that the inner product 
be coordinate invariant. 

In general, elements of V and $V^*$ can be expressed in terms of their respective
bases:

$$ a \in V, \quad a = \sum_i a^i e_i ; \qquad a' \in V^*, \quad a' = \sum_i a'_i\, e^{*i} \tag{3.36} $$

Then, $\langle a', a \rangle = a'_i a^j \langle e^{*i}, e_j \rangle = a'_i a^i$ (summation implied). Thus the pair
$a' \in V^*$, $a \in V$ gets mapped onto a real number, or in other words, $\langle\ ,\ \rangle$ is a bilinear
map: $V^* \otimes V \to \mathbb{R}$.

There is an alternative way to look at this. For a given $a' \in V^*$, we have a
linear map

$$ \langle a', \ \rangle : V \to \mathbb{R} \tag{3.37} $$

defined by

$$ \langle a', \ \rangle : a \in V \to \langle a', a \rangle \in \mathbb{R} \tag{3.38} $$

Thus the dual vector space to V is the space of linear functionals on V.
Let us clarify what we mean by linear functionals. A functional is a map
which takes an element of a vector space to a number. In our case, clearly $\langle a', \ \rangle$
is a functional, and it has the additional property that:

$$ \langle a', \ \rangle : \lambda a + \mu b \to \lambda \langle a', a \rangle + \mu \langle a', b \rangle \tag{3.39} $$

which is just what we mean by linearity.
An important property of dual spaces is that if we have a map of vector
spaces V, W:

$$ f : V \to W $$




then this induces a dual map in the opposite direction:

$$ f^* : W^* \to V^* $$

Let us show this. Given $b' \in W^*$, we want to define $f^*(b') \in V^*$. We can
define it as a linear functional on V:

$$ \langle f^*(b'), \ \rangle : a \to \langle f^*(b'), a \rangle = \langle b', f(a) \rangle \tag{3.40} $$

The inner product on the left of the equality is clearly between elements of V and
$V^*$, while the one on the right is between elements of W and $W^*$. Since the map
$f : V \to W$ is given, the above equation defines the dual map $f^* : W^* \to V^*$.
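For finite-dimensional spaces with chosen bases, the dual map is represented by the transposed matrix. The sketch below illustrates (3.40) with a hypothetical linear map from $\mathbb{R}^3$ to $\mathbb{R}^2$; the matrix `F`, the vector `a` and the covector `b_prime` are made-up sample data, and the helper names are introduced here only for illustration.

```python
def mat_vec(matrix, vec):
    """Apply a matrix (list of rows) to a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def transpose(matrix):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*matrix)]

def pair(covec, vec):
    """Dual pairing <covec, vec> in components."""
    return sum(c * v for c, v in zip(covec, vec))

# Hypothetical sample data: f maps V = R^3 to W = R^2.
F = [[1, 2, 0],
     [0, -1, 3]]
a = [2, -1, 4]        # element of V
b_prime = [5, 7]      # element of W*

pulled_back = mat_vec(transpose(F), b_prime)   # components of f*(b') in V*
lhs = pair(pulled_back, a)                     # <f*(b'), a>, a pairing on V
rhs = pair(b_prime, mat_vec(F, a))             # <b', f(a)>, a pairing on W
```

Both pairings evaluate to the same number, which is the content of the defining relation.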

In the study of differentiable manifolds, we defined the tangent space $T_p(M)$
at a point p, with basis $\partial/\partial x^i$. Now we will define its dual.


Definition: The space dual to $T_p(M)$ is called the cotangent space $T_p^*(M)$.


Denote the dual basis of $T_p^*(M)$ by $dx^i$. Note that this is a formal symbol
for the moment and does not represent an infinitesimal amount of anything!
Thus there is an inner product,

$$ \langle dx^i, \partial/\partial x^j \rangle = \delta^i_j \tag{3.41} $$

This is manifestly invariant under local changes of coordinates, if we assign to
$dx^i$ the transformation law

$$ dx^i \to dy^i = \frac{\partial y^i}{\partial x^j}\, dx^j \tag{3.42} $$

The notation $dx^i$ serves to remind us that this object transforms like an
infinitesimal distance. As we have indicated above, this is only notation, but
now we understand what motivates it.


Definition: An element of the cotangent space is written

$$ \omega = \omega_i\, dx^i $$

and is called a 1-form at p.


The coefficient $\omega_i$ transforms like a “covariant vector”, in “physics” terminology.


As we did in going from vectors to vector fields, we can generalize a form
at a single point to a field defined all over the coordinate patch:

$$ \omega(x) = \omega_i(x)\, dx^i $$

with $\omega_i(x)$ a set of functions on $\mathbb{R}^n$. Such an object, with the coefficients $\omega_i(x)$
being differentiable functions, is called a differential 1-form.

The inner product between basis vectors of the tangent space $T_p(M)$ and
the cotangent space $T_p^*(M)$ gives a product between arbitrary elements of these
two spaces. If $X = a^j(x)\,\partial/\partial x^j$ is a vector field and $\omega(x) = \omega_i(x)\,dx^i$ is a 1-form,
then

$$ \langle \omega, X \rangle = \omega_i\, a^j\, \langle dx^i, \partial/\partial x^j \rangle = \omega_i\, a^i \tag{3.43} $$


This is coordinate-independent, as it should be. The left hand side is a pairing 
between a vector field and a form, each of which is coordinate independent, 
while on the right hand side we find the coefficients (or “components” ) of these 
objects, which are coordinate dependent but transform oppositely across patches 
precisely in such a way that the product is invariant. 
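The opposite transformation of the two sets of components can be seen explicitly for a linear change of coordinates $y^i = M^i{}_j x^j$. The sketch below uses a hypothetical constant matrix `M` and made-up components; vector components transform with `M` and 1-form components with its inverse, and exact rational arithmetic is used so that the two pairings can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical constant Jacobian dy^i/dx^j of a linear change of coordinates.
M = [[Fraction(2), Fraction(1)],
     [Fraction(1), Fraction(3)]]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
M_inv = [[ M[1][1] / det, -M[0][1] / det],
         [-M[1][0] / det,  M[0][0] / det]]

a = [Fraction(4), Fraction(-1)]      # vector components a^i (sample values)
omega = [Fraction(3), Fraction(5)]   # 1-form components omega_i (sample values)

# a'^i = (dy^i/dx^j) a^j  and  omega'_i = omega_j (dx^j/dy^i)
a_new = [sum(M[i][j] * a[j] for j in range(2)) for i in range(2)]
omega_new = [sum(omega[j] * M_inv[j][i] for j in range(2)) for i in range(2)]

pairing_old = sum(omega[i] * a[i] for i in range(2))
pairing_new = sum(omega_new[i] * a_new[i] for i in range(2))
```

The individual components change, but `pairing_old` and `pairing_new` coincide.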

We now define the exterior derivative on functions on the manifold as a
map d defined by:

$$ d : f \to df = \frac{\partial f}{\partial x^i}\, dx^i \tag{3.44} $$

This associates a 1-form to any function f on a manifold. The components of
this 1-form are just the derivatives of f along each of the coordinates. Later we
will see that the exterior derivative can be defined to act on differential forms
and not just functions. We will provide the complete definition at that stage.
The exterior derivative plays a fundamental role in the study of differential
forms.

If X is an arbitrary vector field, then (as we discussed above) it can be used
to map a function to another function by $f \to Xf$. Now we have the equality

$$ Xf = \langle df, X \rangle \tag{3.45} $$


which relates the action of X on f with the inner product between X and the 
exterior derivative of f. 


Exercise: Check the above equality. 
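One way to carry out this check, writing $X = a^j\,\partial/\partial x^j$ and using only the duality relation (3.41), is sketched here:

```latex
\begin{align*}
\langle df, X \rangle
 &= \Big\langle \frac{\partial f}{\partial x^i}\, dx^i ,\; a^j \frac{\partial}{\partial x^j} \Big\rangle
  = \frac{\partial f}{\partial x^i}\, a^j\, \Big\langle dx^i, \frac{\partial}{\partial x^j} \Big\rangle \\
 &= \frac{\partial f}{\partial x^i}\, a^j\, \delta^i_j
  = a^i \frac{\partial f}{\partial x^i}
  = Xf
\end{align*}
```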


Just as for the tangent space, we can study tensor products of the cotangent
space: $(T_p^*(M))^m$ whose basis is $dx^{i_1} \otimes \cdots \otimes dx^{i_m}$. More generally, we can consider
elements of the mixed product $(T_p(M))^n \otimes (T_p^*(M))^m$, which look like

$$ A = A^{i_1 \cdots i_n}{}_{j_1 \cdots j_m}(x)\; \frac{\partial}{\partial x^{i_1}} \otimes \cdots \otimes \frac{\partial}{\partial x^{i_n}} \otimes dx^{j_1} \otimes \cdots \otimes dx^{j_m} \tag{3.46} $$

For a physicist, the components can be thought of as forming a “mixed tensor”
which is nth rank contravariant, mth rank covariant.

Now let us return to the exterior derivative d. We can think of functions
as “0-forms”. Then the operator d acting on a 0-form produces a 1-form. How
about the action of d on 1-forms? That is something we have yet to define.

We could perhaps try something like

$$ d : \omega = \omega_i\, dx^i \to \frac{\partial \omega_i}{\partial x^j}\, dx^j \otimes dx^i \ ? $$

But this does not work! If $\omega_i$ transforms like the components of a 1-form, $\partial \omega_i / \partial x^j$
does not transform in a covariant way.

Exercise: Check the above statement.


Physicists are familiar with one way to remedy this, which is to introduce 
an “affine connection” and a “covariant derivative”. But that relies on the 
existence of a Riemannian metric on the manifold, something we have not yet 
defined. For now our manifolds are equipped only with a differentiable structure. 
So if we want to define generalizations of 1-forms which transform nicely under 
coordinate changes, we have to do something different. 

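Indeed our problem will be solved if we antisymmetrise the derivative. We
can show that if $\omega_i$ transforms like a covariant vector, then
$\frac{\partial \omega_j}{\partial x^i} - \frac{\partial \omega_i}{\partial x^j}$ does
transform like a covariant 2nd-rank tensor:

$$ \frac{\partial \omega_j}{\partial x^i} \to \frac{\partial \omega'_j}{\partial y^i}
 = \frac{\partial x^k}{\partial y^i} \frac{\partial}{\partial x^k} \Big( \frac{\partial x^l}{\partial y^j}\, \omega_l \Big)
 = \frac{\partial x^k}{\partial y^i} \frac{\partial x^l}{\partial y^j} \frac{\partial \omega_l}{\partial x^k}
 + \frac{\partial^2 x^l}{\partial y^i\, \partial y^j}\, \omega_l \tag{3.47} $$

On antisymmetrising, the second term drops out and we get the desired result.
The presence of the second term before antisymmetrising shows explicitly that
the result would not transform like a tensor if we did not antisymmetrise.


To keep track of the fact that $\partial_i \omega_j$ must be antisymmetrised, we write

$$ d\omega = \partial_i \omega_j\, dx^i \wedge dx^j $$

where $dx^i \wedge dx^j$ is defined to be $(dx^i \otimes dx^j - dx^j \otimes dx^i)$. The object so obtained
is called an antisymmetric 2-form, or simply a 2-form.
This motivates us to define the following structure: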
This motivates us to define the following structure: 


Definition: The wedge product of two copies of $T_p^*(M)$ is the vector space
spanned by the basis $dx^i \wedge dx^j$. In general, the basis $dx^{i_1} \wedge dx^{i_2} \wedge \cdots \wedge dx^{i_n}$
spans the space of totally antisymmetric covariant tensor fields. This is denoted
$\Lambda^n(M)$ and its elements are called n-forms. The union of the spaces of all forms,
$\Lambda(M) = \cup_n \Lambda^n(M)$, is called the exterior algebra of the manifold M.


We can now complete our definition of the exterior derivative by specifying
how it acts on any differential n-form, mapping it to an (n+1)-form.


Definition: The exterior derivative on n-forms is a map

$$ d : a \in \Lambda^n(M) \to da \in \Lambda^{n+1}(M) \tag{3.48} $$

defined by:

$$ a = a_{i_1 \cdots i_n}(x)\, dx^{i_1} \wedge \cdots \wedge dx^{i_n} \;\Longrightarrow\;
 da = \Big( \frac{\partial}{\partial x^{i_{n+1}}}\, a_{i_1 \cdots i_n}(x) \Big)\, dx^{i_{n+1}} \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_n} \tag{3.49} $$

The components of da are totally antisymmetrised, as is manifest in the notation.


Thus we have built up a structure of forms on the manifold, which have
rank $0, 1, \ldots, d$ where d is the dimension of the manifold M (clearly a totally
antisymmetrised object with rank greater than d must vanish). Along with this,
we have a map d which takes us from n-forms to (n+1)-forms by antisymmetrised
differentiation.


3.6 Properties of Differential Forms 


We list a few properties satisfied by differential forms. They are rather easy
to prove from the definition and the reader is encouraged to work out all the
proofs.

(i) If $a_m \in \Lambda^m(M)$, $b_n \in \Lambda^n(M)$, then $a_m \wedge b_n = (-1)^{mn}\, b_n \wedge a_m$. This means
that two different forms anticommute if they are both odd, otherwise they commute.

(ii) $d(a_m \wedge b_n) = da_m \wedge b_n + (-1)^m\, a_m \wedge db_n$. Thus, up to possible signs, the d
operator distributes over products of forms just like an ordinary derivative.

(iii) $d^2 = 0$. In other words, acting twice with the d operator on the same
object gives 0. We say that d is a nilpotent operator. This will turn out to be
of fundamental importance. The proof of nilpotence goes as follows:

$$ d^2 a = d \Big( \frac{\partial}{\partial x^{i_{n+1}}}\, a_{i_1 \cdots i_n}(x)\; dx^{i_{n+1}} \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_n} \Big) $$

$$ = \Big( \frac{\partial^2}{\partial x^{i_{n+2}}\, \partial x^{i_{n+1}}}\, a_{i_1 \cdots i_n}(x) \Big)\, dx^{i_{n+2}} \wedge dx^{i_{n+1}} \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_n} $$

$$ = 0 \tag{3.50} $$

since the two derivatives commute, while the wedge product which contracts
with them anticommutes.

(iv) The dimension of $\Lambda^n(M)$ for a d-dimensional manifold is given by the
binomial coefficient $\binom{d}{n} = \frac{d!}{n!\,(d-n)!}$. This is due to the antisymmetry, and
is easy to check. Also, the dimension vanishes for n > d as one would expect
because there are not enough indices to antisymmetrise.
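Property (iv) can be checked by direct enumeration: an independent basis n-form is labelled by a strictly increasing set of indices $i_1 < i_2 < \cdots < i_n$. The following sketch counts these labels for a hypothetical choice d = 4 and compares with the binomial coefficient; the function name is introduced here only for illustration.

```python
from itertools import combinations
from math import comb

def basis_n_forms(d, n):
    """Index tuples i1 < ... < in labelling the independent basis n-forms in d dimensions."""
    return list(combinations(range(1, d + 1), n))

d = 4
# Dimensions of the spaces of n-forms for n = 0, 1, ..., d + 1.
dims = [len(basis_n_forms(d, n)) for n in range(d + 2)]
```

For d = 4 the list `dims` is `[1, 4, 6, 4, 1, 0]`; the final zero reflects the vanishing of all forms of rank greater than d.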


While studying the tangent space, we saw that $\phi : M \to N$ induces a map
$\phi_* : T_p(M) \to T_{\phi(p)}(N)$, the differential map between the tangent spaces. It
turns out that this also induces a map between the cotangent spaces, but in
the reverse direction. This follows from the dual pairing between tangent and
cotangent spaces.

The reverse map is denoted $\phi^* : T^*_{\phi(p)}(N) \to T^*_p(M)$ and is defined as
follows. If $\omega \in T^*_{\phi(p)}(N)$, define $\phi^* \omega \in T^*_p(M)$ by

$$ \langle \phi^* \omega, X \rangle = \langle \omega, \phi_* X \rangle \qquad (X \in T_p(M)) \tag{3.51} $$


Differential forms are useful in studying the topology of general differen- 
tiable manifolds. Much of this study can be performed without introducing 
metric properties. It is important to keep in mind that, although a metric on a 
manifold is an essential construction in physics (particularly general relativity), 
it is an additional structure on a manifold. It is worth knowing what are all 
the operations one can perform without having to define a metric, since the 
consequences of these must clearly turn out to be metric-independent. 

In the special theory of relativity, the manifold with which we deal is flat 
Minkowski spacetime (or Euclidean space after Wick rotation). Here differential 
forms play a relatively trivial role, as the topology of Euclidean space is trivial. 
But they do serve as useful notation, and moreover many equations of physics 
generalise immediately to manifolds which are different from Euclidean space, 
and whose topology may be nontrivial. 


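Example: Free electrodynamics in 4 spacetime dimensions. The electromagnetic
field is represented by a 1-form $A(x) = A_\mu(x)\, dx^\mu$. The field strength is
simply the 2-form $F(x) = dA(x) = \frac{1}{2} F_{\mu\nu}(x)\, dx^\mu \wedge dx^\nu$ where

$$ F_{\mu\nu}(x) = \partial_\mu A_\nu(x) - \partial_\nu A_\mu(x) \tag{3.52} $$

We know that $d^2 = 0$. Since $F = dA$, it follows that $dF = d^2 A = 0$. What
does this condition on F physically mean? Writing it out in components, with
a comma denoting partial differentiation ($F_{\mu\nu,\lambda} \equiv \partial_\lambda F_{\mu\nu}$), we
find

$$ dF = F_{\mu\nu,\lambda}\, dx^\lambda \wedge dx^\mu \wedge dx^\nu = 0 \tag{3.53} $$

By total antisymmetry of the threefold wedge product, we get

$$ F_{\mu\nu,\lambda} + F_{\nu\lambda,\mu} + F_{\lambda\mu,\nu} = 0 \tag{3.54} $$

This is called the Bianchi identity.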
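The Bianchi identity is purely algebraic once we use the fact that partial derivatives commute. The sketch below makes this explicit: the array `H[l][m][n]` stands for the second derivatives $\partial_\lambda \partial_\mu A_\nu$ at a point and is filled with arbitrary integers, constrained only to be symmetric in its first two indices. The random values and the helper names are illustrative assumptions, not physical data.

```python
import random

random.seed(0)
D = 4  # number of spacetime dimensions

# H[l][m][n] plays the role of d_l d_m A_n; it is symmetric in (l, m)
# because partial derivatives commute, and otherwise arbitrary.
H = [[[0] * D for _ in range(D)] for _ in range(D)]
for l in range(D):
    for m in range(l, D):
        for n in range(D):
            H[l][m][n] = H[m][l][n] = random.randint(-9, 9)

def F_deriv(m, n, l):
    """F_{mn,l} = d_l (d_m A_n - d_n A_m)."""
    return H[l][m][n] - H[l][n][m]

# Cyclic sum F_{mn,l} + F_{nl,m} + F_{lm,n} for every index triple.
cyclic_sums = [F_deriv(m, n, l) + F_deriv(n, l, m) + F_deriv(l, m, n)
               for m in range(D) for n in range(D) for l in range(D)]
```

Every entry of `cyclic_sums` vanishes identically, whatever values are placed in `H`, which is why the text calls this an identity rather than an equation of motion.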


If we temporarily get ahead of ourselves and introduce the Minkowski metric
of spacetime, the above equations can be written in a more familiar form.
First of all we find:

$$ F_{ij,k} + \text{cyclic} = 0, \qquad i, j, k \in \{1, 2, 3\} $$
$$ F_{0i,j} + F_{ij,0} + F_{j0,i} = 0 \tag{3.55} $$

Next, defining the magnetic and electric fields by

$$ B_i = \frac{1}{2}\, \epsilon_{ijk} F_{jk}, \qquad E_i = -F_{0i} \tag{3.56} $$

the Bianchi identity becomes

$$ \partial_i B_i = 0 \;\Rightarrow\; \nabla \cdot \mathbf{B} = 0 $$
$$ \partial_i E_j - \partial_j E_i = -\epsilon_{ijk}\, \frac{\partial B_k}{\partial t} \;\Rightarrow\; \nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{3.57} $$

So the Bianchi identity, $d^2 A = dF = 0$, is just equivalent to two of Maxwell’s
equations in empty space.

It is tempting at this point to display the two remaining Maxwell equations
in the same notation. For this, we must first define the constant antisymmetric
tensor (under SO(3,1) Lorentz transformations) on $\mathbb{R}^4$: it is denoted $\epsilon_{\mu\nu\lambda\rho}$, and
takes the value +1 if $\{\mu, \nu, \lambda, \rho\}$ is an even permutation of $\{0, 1, 2, 3\}$ and −1 if
it is an odd permutation. It is equal to 0 if any two of the indices coincide. We
can think of this tensor as making up the components of the 4-form

$$ \frac{1}{4!}\, \epsilon_{\mu\nu\lambda\rho}\, dx^\mu \wedge dx^\nu \wedge dx^\lambda \wedge dx^\rho \tag{3.58} $$

We can use $\epsilon_{\mu\nu\lambda\rho}$ to define what we call the dual of any arbitrary form.
(We will discuss this concept in more generality in subsequent chapters). For a
1-form $a = a_\mu\, dx^\mu$, we define its dual 3-form

$$ {}^*a = \frac{1}{3!}\, {}^*a_{\mu\nu\lambda}\, dx^\mu \wedge dx^\nu \wedge dx^\lambda \tag{3.59} $$

by specifying the components as ${}^*a_{\mu\nu\lambda} = \epsilon_{\mu\nu\lambda\rho}\, a^\rho$.
Similarly, the dual of a 2-form $b = \frac{1}{2} b_{\mu\nu}\, dx^\mu \wedge dx^\nu$ is the 2-form
${}^*b = \frac{1}{2}\, {}^*b_{\mu\nu}\, dx^\mu \wedge dx^\nu$ with components

$$ {}^*b_{\mu\nu} = \frac{1}{2}\, \epsilon_{\mu\nu\lambda\rho}\, b^{\lambda\rho} \tag{3.60} $$


It is easy to see similarly that in general, the dual of an n-form is a (4−n)-form,
and that taking the dual of any form twice gives the original form back again,
up to a sign that depends on n and on the signature of the metric:
${}^*({}^*a) = \pm a$ for any n-form a.
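The double dual of a 2-form can be computed directly from (3.60). The sketch below is an illustration under stated assumptions: indices are raised with a diagonal metric whose entries are ±1, the sample 2-form `b` is made up, and both the Euclidean choice and the Minkowski choice diag(−1, 1, 1, 1) are tried. With these conventions the Euclidean double dual returns b, while the Minkowski one returns −b.

```python
from fractions import Fraction
from itertools import permutations

def parity(perm):
    """+1 for an even permutation of (0, 1, 2, 3), -1 for an odd one."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:          # sort by transpositions, flipping the sign each time
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign

# Nonzero components of the epsilon symbol; index tuples not listed here give 0.
EPS = {p: parity(p) for p in permutations(range(4))}

def dual_2form(b, eta):
    """(*b)_{mu nu} = 1/2 eps_{mu nu la rho} b^{la rho}, indices raised with diagonal eta."""
    raised = [[eta[l] * eta[r] * b[l][r] for r in range(4)] for l in range(4)]
    out = [[Fraction(0)] * 4 for _ in range(4)]
    for (m, n, l, r), sign in EPS.items():
        out[m][n] += Fraction(sign, 2) * raised[l][r]
    return out

# Hypothetical antisymmetric components b_{mu nu}.
b = [[0, 1, 2, 3],
     [-1, 0, 4, 5],
     [-2, -4, 0, 6],
     [-3, -5, -6, 0]]

euclidean = [1, 1, 1, 1]
minkowski = [-1, 1, 1, 1]
double_dual_euclidean = dual_2form(dual_2form(b, euclidean), euclidean)
double_dual_minkowski = dual_2form(dual_2form(b, minkowski), minkowski)
```

The sign flip in the Minkowski case comes from the single negative entry of the metric, which enters once through the raised indices.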

Now consider ${}^*F$ where F is the Maxwell field strength 2-form defined
above. Clearly, in general we do not have $d\,{}^*F = 0$, since ${}^*F$ (unlike F itself)
is not d of anything. Nevertheless, let us examine the content of the equation
$d({}^*F) = 0$, which could be imposed by physical rather than mathematical
requirements. The resulting equations, in component form, will turn out to be
the other two Maxwell equations. Thus, these equations will be seen to have
a dynamical content, and they have to be imposed by hand, unlike the two we
already obtained, which are just mathematical identities.

Working this out in components we have:

$$ d({}^*F) = \frac{1}{4}\, \epsilon_{\mu\nu\lambda\rho}\, \partial_\alpha F^{\lambda\rho}\, dx^\alpha \wedge dx^\mu \wedge dx^\nu \tag{3.61} $$

Setting this to zero, and taking the dual of the equation we find:

$$ \epsilon_\sigma{}^{\alpha\mu\nu}\, \epsilon_{\mu\nu\lambda\rho}\, \partial_\alpha F^{\lambda\rho}\, dx^\sigma = 0 \tag{3.62} $$

which after contracting the indices of the two $\epsilon$-symbols gives

$$ \partial^\mu F_{\mu\nu} = 0 \tag{3.63} $$




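These are just the following two empty-space Maxwell equations:

$$ \nabla \cdot \mathbf{E} = 0, \qquad \nabla \times \mathbf{B} = \frac{\partial \mathbf{E}}{\partial t} \tag{3.64} $$

Thus in differential form language, the free Maxwell’s equations are summarized in:

$$ dF = 0 \quad \text{(identity)} $$
$$ d\,{}^*F = 0 \quad \text{(dynamical equation)} \tag{3.65} $$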


In realistic physical situations the second equation acquires a term on the right 
hand side representing an electrical charge or current source. 

Finally, note that the fundamental concept of gauge invariance in Maxwell’s
equations has a simple interpretation in terms of differential forms. If we
change the vector potential by $A \to A' = A + d\Lambda$ (where $\Lambda$ is a function, or
0-form), then

$$ F \to F' = dA' = d(A + d\Lambda) = dA = F \tag{3.66} $$

because $d^2 = 0$. This is gauge invariance for free electrodynamics.


3.7 More About Vectors and Forms 


Here we will display some more properties of the structures defined above. In
particular, this will lead to a binary operation, the “Lie bracket”, which maps
a pair of vector fields to a new vector field.

Choose a vector field $X = a^i(x)\,\partial/\partial x^i$, and a 1-form $\omega = \omega_i(x)\,dx^i$. Recall
that X acts on a function f by $X : f \to Xf$ to give a new function $a^i(x)\,\partial_i f$.
Also, the inner product between vector fields and 1-forms is a function:

$$ \langle \omega, X \rangle = \omega_i(x)\, a^i(x) \tag{3.67} $$


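Now given two vector fields, we can act with them successively on a function,
in two different orders, and find the commutator of the two actions. Thus,
if $X = a^i(x)\,\partial/\partial x^i$, $Y = b^i(x)\,\partial/\partial x^i$, consider

$$ f \to Yf \to XYf = a^i \frac{\partial}{\partial x^i} \Big( b^j \frac{\partial f}{\partial x^j} \Big) \tag{3.68} $$

and,

$$ f \to Xf \to YXf = b^i \frac{\partial}{\partial x^i} \Big( a^j \frac{\partial f}{\partial x^j} \Big) \tag{3.69} $$

Computing the difference of these two operations:

$$ XYf - YXf = \Big( a^i \frac{\partial b^j}{\partial x^i} - b^i \frac{\partial a^j}{\partial x^i} \Big) \frac{\partial f}{\partial x^j} \tag{3.70} $$

This defines a new vector field:

Definition: The Lie bracket of two vector fields X and Y is the vector field

$$ [X, Y] : f \to [X, Y] f = XYf - YXf \tag{3.71} $$

[X, Y] is sometimes also denoted $L_X Y$, the Lie derivative of Y along X. In
components,

$$ [X, Y] = \Big( a^i \frac{\partial b^j}{\partial x^i} - b^i \frac{\partial a^j}{\partial x^i} \Big) \frac{\partial}{\partial x^j} \tag{3.72} $$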


The Lie bracket gives a Lie algebra structure to the tangent space. This
arises from the following (easily derived) identities:

$$ [X, Y] = -[Y, X] $$
$$ [X_1 + X_2, Y] = [X_1, Y] + [X_2, Y] $$
$$ [cX, Y] = c\,[X, Y], \qquad c \in \mathbb{R} \tag{3.73} $$

as well as the Jacobi identity:

$$ [[X, Y], Z] + [[Y, Z], X] + [[Z, X], Y] = 0 \tag{3.74} $$
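These identities can be tested on a simple family of examples. For linear vector fields $X_A = (A x)^j\, \partial/\partial x^j$, with A a constant matrix, the component formula for the bracket gives $[X_A, X_B] = X_{BA - AB}$, so the bracket reduces to matrix algebra. The sketch below uses hypothetical integer matrices and helper names introduced only for this illustration.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_sub(A, B):
    """Entrywise difference A - B."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_add(*Ms):
    """Entrywise sum of any number of matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*Ms)]

def bracket(A, B):
    """Matrix of [X_A, X_B] for linear fields X_A = (A x)^j d/dx^j, namely B A - A B."""
    return mat_sub(mat_mul(B, A), mat_mul(A, B))

# Hypothetical sample matrices defining three linear vector fields on R^3.
A = [[0, 1, 0], [0, 0, 2], [1, 0, 0]]
B = [[1, 0, 0], [0, -1, 3], [0, 0, 2]]
C = [[0, 0, 1], [4, 0, 0], [0, -2, 1]]

zero = [[0] * 3 for _ in range(3)]
antisymmetry = mat_add(bracket(A, B), bracket(B, A))
jacobi = mat_add(bracket(bracket(A, B), C),
                 bracket(bracket(B, C), A),
                 bracket(bracket(C, A), B))
```

Both `antisymmetry` and `jacobi` come out as the zero matrix. This checks the identities only for linear fields; the general case follows from the definition of the bracket as a commutator.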


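Next, we prove an interesting identity about 2-forms. Since a 1-form maps
tangent vector fields to real functions,

$$ \omega : X \to \langle \omega, X \rangle, \tag{3.75} $$

a 2-form will do the same for a pair of vectors:

$$ \Omega = \frac{1}{2}\, \Omega_{ij}(x)\, dx^i \wedge dx^j $$
$$ \Omega : (X, Y) \to \langle \Omega; X, Y \rangle = \Omega_{ij}(x)\, a^i(x)\, b^j(x) \tag{3.76} $$

Now if we are given a 1-form $\omega$ and two vector fields X and Y:

$$ \omega = \omega_i(x)\, dx^i, \qquad X = a^i(x) \frac{\partial}{\partial x^i}, \qquad Y = b^i(x) \frac{\partial}{\partial x^i} \tag{3.77} $$

then $d\omega$ is a 2-form and we have the identity:

$$ \langle d\omega; X, Y \rangle = \frac{1}{2} \Big\{ X \langle \omega, Y \rangle - Y \langle \omega, X \rangle - \langle \omega, [X, Y] \rangle \Big\} \tag{3.78} $$

The proof follows by expanding both sides:

$$ \text{L.H.S.} = \frac{1}{2} \big( \partial_i \omega_j - \partial_j \omega_i \big)\, a^i b^j $$

$$ \text{R.H.S.} = \frac{1}{2} \Big[ a^i \frac{\partial}{\partial x^i} (\omega_j b^j) - b^i \frac{\partial}{\partial x^i} (\omega_j a^j) - \omega_j \Big( a^i \frac{\partial b^j}{\partial x^i} - b^i \frac{\partial a^j}{\partial x^i} \Big) \Big] $$

$$ = \frac{1}{2} \big( \partial_i \omega_j - \partial_j \omega_i \big)\, a^i b^j \tag{3.79} $$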




The Lie bracket of vector fields appears in the last term in this identity, and 
indeed the identity can be turned around to provide an alternative definition of 
the Lie bracket. 


Chapter 4 


Differentiable Manifolds II 


4.1 Riemannian Geometry 


So far we have studied differentiable manifolds without assigning any metric to 
them. The properties of spaces that we have discussed so far — continuity for 
topological spaces, and differentiability for manifolds - do not require the as- 
signment of a metric. However, manifolds equipped with a notion of distance are 
fundamental in physics, so we now turn to the study of Riemannian manifolds, 
namely manifolds with a metric. 

Since a manifold is only locally like Euclidean space, we will have to start 
by working locally. Moreover we must make sure that whatever distance we 
define between two points will not depend on the specific coordinate systems 
around the points. 

On $\mathbb{R}^n$ with Cartesian coordinates¹ $x^\mu$, there is a natural notion of distance
between two infinitesimally separated points $x^\mu$ and $x^\mu + \delta x^\mu$ given by

$$ ds^2 = \delta x^\mu\, \delta x^\mu $$

We can use this fact to define a distance on a general differentiable manifold
M. Here, in a given chart with coordinates $x^\mu$, consider two points p, q whose
coordinates differ infinitesimally:

$$ p \to x^\mu, \qquad q \to x^\mu + \delta x^\mu $$

Suppose another chart with coordinates $y^\mu$ overlaps with the first one. Then
on the overlap of charts, the displacement $\delta x^\mu$ is given in terms of $\delta y^\mu$ by the
chain rule:

$$ \delta x^\mu = \frac{\partial x^\mu}{\partial y^\nu}\, \delta y^\nu \tag{4.1} $$

¹In this section we use $\mu = 1, 2, \ldots, d$ to label the directions of a d-dimensional manifold.
This notation is common in physics. Summation over repeated indices is implied, as before.




Clearly

$$ \delta x^\mu\, \delta x^\mu = \frac{\partial x^\mu}{\partial y^\nu} \frac{\partial x^\mu}{\partial y^\lambda}\, \delta y^\nu\, \delta y^\lambda \tag{4.2} $$

so the Cartesian definition of distance is not coordinate-independent.

However, the above equation tells us how to modify the definition such that
it becomes coordinate independent. We need to “compensate” the transformation
of coordinates by introducing a quantity that transforms in the opposite
way. Therefore we define a rank-2 covariant tensor field $g_{\mu\nu}(x)$:

$$ g_{\mu\nu}(x) \in T_p^*(M) \otimes T_p^*(M) \tag{4.3} $$

where x is the coordinate of the point p. Given such a tensor, the distance d(p, q)
between points p and q (infinitesimally separated in terms of their coordinates
in the given chart) is defined by:

$$ [d(p, q)]^2 \equiv ds^2 = g_{\mu\nu}(x)\, \delta x^\mu\, \delta x^\nu \tag{4.4} $$

This is coordinate independent by virtue of the transformation law of a rank-2
covariant tensor field:

$$ g'_{\mu\nu}(y) = \frac{\partial x^\lambda}{\partial y^\mu} \frac{\partial x^\rho}{\partial y^\nu}\, g_{\lambda\rho}(x) \tag{4.5} $$

from which, using the chain rule, it is easy to see that:

$$ g'_{\mu\nu}(y)\, \delta y^\mu\, \delta y^\nu = g_{\mu\nu}(x)\, \delta x^\mu\, \delta x^\nu \tag{4.6} $$

The tensor $g_{\mu\nu}$ must be chosen so that the three axioms defined in Chapter
1 for a metric on a topological space are satisfied. $g_{\mu\nu}$ is called the Riemann
metric or simply metric tensor.
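A familiar instance of the transformation law (4.5) is the passage from Cartesian to polar coordinates on the plane. In the sketch below the x-chart is Cartesian with metric $\delta_{\mu\nu}$, the y-chart is $(r, \theta)$, and the sample point is a hypothetical choice; the function names are introduced here only for illustration.

```python
import math

def jacobian(r, theta):
    """Entries J[l][m] = dx^l/dy^m for x = (r cos(theta), r sin(theta)), y = (r, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def transformed_metric(g, J):
    """g'_{mu nu} = (dx^l/dy^mu)(dx^r/dy^nu) g_{l r}, following (4.5)."""
    n = len(g)
    return [[sum(J[l][m] * J[r][k] * g[l][r] for l in range(n) for r in range(n))
             for k in range(n)] for m in range(n)]

cartesian_metric = [[1.0, 0.0], [0.0, 1.0]]
r, theta = 2.0, 0.6                      # hypothetical sample point
polar_metric = transformed_metric(cartesian_metric, jacobian(r, theta))
```

Up to rounding, `polar_metric` equals diag(1, r²), which reproduces the usual line element $ds^2 = dr^2 + r^2 d\theta^2$.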

The above definition only tells us the distance between infinitesimally separated
points. To extend it to compute the total length of any path between
finitely separated points, simply integrate the infinitesimal ds defined above
along the given path. We will define the distance between finitely-separated
points p, q to be the minimum of all path lengths between the points:

$$ d(p, q) = \min \int_p^q ds \tag{4.7} $$

The integral is evaluated on the image of the path in some chosen chart.

We may write a Riemannian metric as $G = g_{\mu\nu}(x)\, dx^\mu \otimes dx^\nu$ where the
notation $\otimes$ highlights the fact that it is similar to a differential form except that
it is not antisymmetric. Such an object can be contracted with two vector fields
using the duality of tangent and cotangent spaces that we have already noted:

$$ \Big\langle dx^\mu, \frac{\partial}{\partial x^\nu} \Big\rangle = \delta^\mu_\nu \tag{4.8} $$

Thus, if $A = a^\mu(x) \frac{\partial}{\partial x^\mu}$ and $B = b^\mu(x) \frac{\partial}{\partial x^\mu}$ are two vector fields, then

$$ \langle G; A, B \rangle = g_{\mu\nu}(x)\, a^\lambda(x)\, b^\rho(x)\, \Big\langle dx^\mu, \frac{\partial}{\partial x^\lambda} \Big\rangle \Big\langle dx^\nu, \frac{\partial}{\partial x^\rho} \Big\rangle
 = g_{\mu\nu}(x)\, a^\mu(x)\, b^\nu(x) \tag{4.9} $$




Thus, the metric defines an inner product on the space of vector fields:

$$ (A, B)_G = \langle G; A, B \rangle = g_{\mu\nu}\, a^\mu b^\nu \tag{4.10} $$


This provides new insight into the meaning of the Riemannian metric. 

Indeed, there is a relation between the two roles of the metric: that of
providing a distance on the manifold, and that of providing an inner product
on vector fields. To see this, note that:

$$ d(p, q) = \min \int_p^q ds = \min \int_p^q \sqrt{g_{\mu\nu}(x)\, dx^\mu\, dx^\nu}
 = \min \int_0^1 \sqrt{g_{\mu\nu}\, \frac{dx^\mu}{dt} \frac{dx^\nu}{dt}}\; dt \tag{4.11} $$

where we have parametrised the image of each curve joining p and q as $x^\mu(t)$
with $0 \le t \le 1$, with $x^\mu(0)$, $x^\mu(1)$ being the coordinates of p, q respectively.

Now $\frac{dx^\mu}{dt}$ are just the components of the tangent vector:

$$ T = \frac{dx^\mu}{dt} \frac{\partial}{\partial x^\mu} \tag{4.12} $$

to the curve. So

$$ d(p, q) = \min \int_0^1 \sqrt{(T, T)_G}\; dt \tag{4.13} $$

This expresses the distance between two points along a given curve as an integral
over the norm of the tangent vector to the curve.
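The integrand of (4.13) is straightforward to evaluate numerically for a single given curve. The sketch below does this for a hypothetical example: the plane in polar coordinates with metric diag(1, r²) and a quarter circle of radius 2. It uses a midpoint rule and finite-difference velocities, and it computes the length of this one curve only; the distance d(p, q) would require minimising over all curves joining the endpoints.

```python
import math

def metric(point):
    """Metric components g_{mu nu} of the plane in polar coordinates (r, theta)."""
    r, _theta = point
    return [[1.0, 0.0], [0.0, r * r]]

def curve(t):
    """Hypothetical sample curve: quarter circle of radius 2, with 0 <= t <= 1."""
    return (2.0, 0.5 * math.pi * t)

def curve_length(curve, metric, steps=1000, h=1e-6):
    """Midpoint-rule estimate of the integral of sqrt(g_{mu nu} T^mu T^nu) dt over [0, 1]."""
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) / steps
        x = curve(t)
        # Tangent vector components T^mu = dx^mu/dt by central differences.
        velocity = [(p - m) / (2 * h) for p, m in zip(curve(t + h), curve(t - h))]
        g = metric(x)
        speed_sq = sum(g[i][j] * velocity[i] * velocity[j]
                       for i in range(2) for j in range(2))
        total += math.sqrt(speed_sq) / steps
    return total

length = curve_length(curve, metric)
```

The result approximates π, the arc length of a quarter circle of radius 2. The straight chord between the same endpoints is shorter, so this curve is not the minimising one.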


Definition: A differentiable manifold equipped with a Riemannian metric as 
defined above is said to be a Riemannian manifold. 


4.2 Frames 


It is useful to define a basis for the tangent space at each point of a manifold.
A continuously varying basis will correspond to a collection of d vector fields on
M, where d is the dimension of M.


We can choose the set to be orthonormal in the inner product on $T_p(M)$
defined by the metric. If the vector fields are denoted

$$ E_a(x) = E_a^\mu(x) \frac{\partial}{\partial x^\mu}, \qquad a = 1, \ldots, d \tag{4.14} $$

then this amounts to requiring that:

$$ (E_a, E_b)_G = g_{\mu\nu}(x)\, E_a^\mu(x)\, E_b^\nu(x) = \delta_{ab} \tag{4.15} $$




Definition: Vector fields $E_a(x)$ satisfying the above requirements are called
orthonormal frames or vielbeins. Typically they are just referred to as “frames”.


We may also define the 1-forms dual to these vector fields. These are denoted:

$$ e^a(x) = e^a_\mu(x)\, dx^\mu \tag{4.16} $$

and are defined to satisfy:

$$ \langle e^a, E_b \rangle = e^a_\mu(x)\, E_b^\mu(x) = \delta^a{}_b \tag{4.17} $$

For a d-dimensional manifold, an O(d) rotation of the frames gives a new
set of orthonormal frames (here O(d) denotes the group of d × d orthogonal
matrices). The duality between $e^a$ and $E_a$ is preserved if the same O(d) rotation
is made on both.

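The frames and their dual 1-forms in turn determine the metric tensor of the
manifold. From the 1-form $e^a$ we can construct the O(d)-invariant symmetric
second-rank tensor, $\sum_{a=1}^d e^a \otimes e^a$. As we have seen, such a tensor defines an
inner product on tangent vectors. Thus we may compute:

$$ \Big\langle \sum_{a=1}^d e^a \otimes e^a ; E_b, E_c \Big\rangle = e^a_\mu\, e^a_\nu\, E_b^\mu\, E_c^\nu = \delta_{bc} = (E_b, E_c)_G \tag{4.18} $$

from which we conclude that

$$ G = g_{\mu\nu}\, dx^\mu \otimes dx^\nu = \sum_{a=1}^d e^a \otimes e^a \tag{4.19} $$

or in component notation,

$$ g_{\mu\nu}(x) = e^a_\mu(x)\, e^a_\nu(x) \tag{4.20} $$

(the repeated index is summed over).
Similarly, if we define a rank-2 tensor field $\sum_{a=1}^d E_a \otimes E_a$, then we can
write

$$ \sum_{a=1}^d E_a \otimes E_a = h^{\mu\nu}(x)\, \frac{\partial}{\partial x^\mu} \otimes \frac{\partial}{\partial x^\nu} \tag{4.21} $$

from which it follows easily that

$$ h^{\mu\nu}(x)\, g_{\nu\lambda}(x) = \delta^\mu{}_\lambda \tag{4.22} $$

So $h^{\mu\nu}$ is the matrix inverse of $g_{\mu\nu}$. Henceforth we write $h^{\mu\nu}$ as $g^{\mu\nu}$.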


Exercise: Check all the manipulations above. 
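As a concrete instance of these manipulations, consider the plane in polar coordinates with the hypothetical frame choice $e^1 = dr$, $e^2 = r\, d\theta$, whose dual vector fields are $E_1 = \partial/\partial r$, $E_2 = (1/r)\, \partial/\partial\theta$. The sketch below evaluates everything at the sample value r = 3 using exact rational arithmetic.

```python
from fractions import Fraction

r = Fraction(3)                                        # sample radial coordinate
e = [[Fraction(1), Fraction(0)], [Fraction(0), r]]     # e[a][mu] = e^a_mu
E = [[Fraction(1), Fraction(0)], [Fraction(0), 1 / r]] # E[a][mu] = E_a^mu

dim = 2
# g_{mu nu} = e^a_mu e^a_nu, summed over the frame index a, as in (4.20).
g = [[sum(e[a][m] * e[a][n] for a in range(dim)) for n in range(dim)] for m in range(dim)]
# h^{mu nu} = E_a^mu E_a^nu, the components appearing in (4.21).
h = [[sum(E[a][m] * E[a][n] for a in range(dim)) for n in range(dim)] for m in range(dim)]
# Duality <e^a, E_b> = e^a_mu E_b^mu, as in (4.17).
duality = [[sum(e[a][m] * E[b][m] for m in range(dim)) for b in range(dim)] for a in range(dim)]
# Orthonormality (E_a, E_b)_G = g_{mu nu} E_a^mu E_b^nu, as in (4.15).
orthonormality = [[sum(g[m][n] * E[a][m] * E[b][n] for m in range(dim) for n in range(dim))
                   for b in range(dim)] for a in range(dim)]
# Product h^{mu nu} g_{nu lambda}, which should be the identity by (4.22).
h_times_g = [[sum(h[m][n] * g[n][l] for n in range(dim)) for l in range(dim)] for m in range(dim)]

identity = [[1, 0], [0, 1]]
```

Here `g` comes out as diag(1, r²), and `duality`, `orthonormality` and `h_times_g` are all equal to the identity matrix.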


Once we are equipped with a set of orthonormal frames and their dual
1-forms, it becomes convenient to study vectors and tensors by referring their
components to this basis. For example, given any vector field $A(x) = A^\mu(x) \frac{\partial}{\partial x^\mu}$,
we can take its inner product with the 1-form $e^a(x)$:

$$ A^a(x) = \langle e^a(x), A(x) \rangle = e^a_\mu(x)\, A^\mu(x) \tag{4.23} $$

The n-component objects $A^a(x)$ (where n denotes the dimension of M) are a
collection of (scalar) functions on the manifold. Under changes of the coordinate
system, both $e^a_\mu(x)$ and $A^\mu(x)$ change such that their inner product remains invariant.

Similarly, a 1-form $B(x) = B_\mu(x)\, dx^\mu$ can be converted into a collection of
(scalar) functions:

$$ B_a(x) = \langle B(x), E_a(x) \rangle = E_a^\mu(x)\, B_\mu(x) \tag{4.24} $$

Although coordinate invariant, the functions $A^a(x)$ and $B_a(x)$ do depend
on the choice of orthonormal frames, and they change under O(n) rotations of
these frames, for example:

$$ B_a(x) \to \Lambda_a{}^b(x)\, B_b(x), \qquad \Lambda^T \Lambda = 1 \tag{4.25} $$

So it may seem we have not gained anything by converting the form into O(n)
vector-valued functions. Indeed, going from forms to O(n) vectors is completely
reversible and this is also true if we extend the above relation to map coordinate
tensors and O(n) tensors to each other via:

$$ B_{a_1 a_2 \cdots a_n} = E_{a_1}^{\mu_1} E_{a_2}^{\mu_2} \cdots E_{a_n}^{\mu_n}\, B_{\mu_1 \mu_2 \cdots \mu_n} $$
$$ B_{\mu_1 \mu_2 \cdots \mu_n} = e^{a_1}_{\mu_1} e^{a_2}_{\mu_2} \cdots e^{a_n}_{\mu_n}\, B_{a_1 a_2 \cdots a_n} \tag{4.26} $$


This suggests that it is merely a matter of convenience to work with quantities 
that transform under coordinate transformations (tensor fields and forms) as 
against those that transform under O(n). 

However, the utility of orthonormal frames goes beyond convenience. O(n)
is a semi-simple Lie group and its representations are well-understood. In partic- 
ular (see N. Mukunda’s lectures in this volume) it admits a family of spinor rep- 
resentations. Fields that transform in these representations cannot be mapped 
to (or from) coordinate tensors or differential forms as above. Therefore by in- 
troducing frames and using fields that transform in O(n) representations, we are 
able to deal with a more general class of objects on manifolds. This possibility 
is of crucial importance in the physical context of general relativity, since this 
is precisely how fermions are introduced. 


4.3 Connections, Curvature and Torsion 


Recall that we defined the exterior derivative on 1-forms as²

$$ d : A = A_\mu\, dx^\mu \to dA = \partial_\nu A_\mu\, dx^\nu \wedge dx^\mu \tag{4.27} $$

²As usual the components of the forms depend on coordinates, but from this section onwards
we suppress the argument to simplify our notation.




We have shown that dA obtained in this way is indeed a 2-form. 

Now suppose we first convert the 1-form A into a zero-form using the frames,
as in the previous section, and then take the exterior derivative. Thus we start
by defining:

$$ A_a = \langle A, E_a \rangle = E_a^\mu A_\mu \tag{4.28} $$

and then attempt to define:

$$ dA_a = \partial_\mu A_a\, dx^\mu \tag{4.29} $$

Is the result sensible? It is certainly a differential form. But it is easy to see
that it fails to be an O(n) vector. In fact, under a local O(n) rotation of the
frames,

$$ A_a \to A'_a = \Lambda_a{}^b A_b, \qquad \Lambda_a{}^b \in O(n) \tag{4.30} $$

one finds that:

$$ dA_a \to dA'_a = d\Lambda_a{}^b\, A_b + \Lambda_a{}^b\, dA_b \tag{4.31} $$

Because of the first term, which involves a derivative of $\Lambda_a{}^b$, this is not how an
O(n) vector transforms.

Therefore we need to look for a new type of derivative D, called a covariant
derivative, on O(n) vectors. It is required to have the property that, when acting
on O(n) vectors, it gives us back O(n) vectors. To qualify as a derivative, D
must be a linear operation, so it is natural to try a definition like:

$$ (DA)_a = dA_a + \omega_a{}^b A_b \tag{4.32} $$

Here,

$$ \omega_a{}^b = \omega_{\mu a}{}^b\, dx^\mu \tag{4.33} $$

is a 1-form whose O(n) transformation rules remain to be found, and will be
determined by requiring that the result of differentiation is again an O(n) vector.


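For this, we simultaneously carry out the known O(n) transformation on
the 1-form and an arbitrary transformation on $\omega$:

$$ A_a \to A'_a = \Lambda_a{}^b A_b $$
$$ \omega_a{}^b \to \omega'_a{}^b \tag{4.34} $$

Then

$$ (DA')_a = dA'_a + \omega'_a{}^b A'_b $$
$$ = d\Lambda_a{}^b\, A_b + \Lambda_a{}^b\, dA_b + \omega'_a{}^b\, \Lambda_b{}^c A_c $$
$$ = \Lambda_a{}^b \Big( dA_b + (\Lambda^{-1})_b{}^c\, d\Lambda_c{}^d\, A_d + (\Lambda^{-1})_b{}^c\, \omega'_c{}^d\, \Lambda_d{}^e A_e \Big) \tag{4.35} $$

Requiring $(DA')_a = \Lambda_a{}^b (DA)_b$ as expected of an O(n) vector, and comparing
terms, we find:

$$ \Lambda^{-1} d\Lambda + \Lambda^{-1} \omega' \Lambda = \omega \tag{4.36} $$

where matrix multiplication is intended. Hence

$$ \omega' = \Lambda \omega \Lambda^{-1} - d\Lambda\, \Lambda^{-1} \tag{4.37} $$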


Thus we have shown that $\omega_a{}^b$ transforms inhomogeneously under O(n). It is
called a connection (sometimes, in this context, the spin connection).

We can extract an important property of the spin connection from the above
transformation law. If we take the transpose and use the fact that $\Lambda^T = \Lambda^{-1}$,
we find:

$$ \omega'^T = \Lambda \omega^T \Lambda^{-1} + d\Lambda\, \Lambda^{-1} \tag{4.38} $$

Adding and subtracting the two equations above, we see that the symmetric part
of $\omega$ transforms as a tensor, while the antisymmetric part has the characteristic
inhomogeneous transformation law³.
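The intermediate steps behind the transposed law and the resulting split can be spelled out as follows. This is a sketch in matrix notation, using only $\Lambda \Lambda^T = 1$ and the transformation law $\omega' = \Lambda \omega \Lambda^{-1} - d\Lambda\, \Lambda^{-1}$ obtained above.

```latex
\begin{align*}
\Lambda \Lambda^T = 1 \;&\Longrightarrow\; d\Lambda\, \Lambda^T + \Lambda\, d\Lambda^T = 0
  \;\Longrightarrow\; \Lambda\, d\Lambda^T = -\, d\Lambda\, \Lambda^{-1} \\
\omega'^T &= \big( \Lambda \omega \Lambda^{-1} - d\Lambda\, \Lambda^{-1} \big)^T
  = \Lambda\, \omega^T \Lambda^{-1} - \Lambda\, d\Lambda^T
  = \Lambda\, \omega^T \Lambda^{-1} + d\Lambda\, \Lambda^{-1} \\
\tfrac{1}{2} \big( \omega' + \omega'^T \big) &= \Lambda\, \tfrac{1}{2} \big( \omega + \omega^T \big)\, \Lambda^{-1} \\
\tfrac{1}{2} \big( \omega' - \omega'^T \big) &= \Lambda\, \tfrac{1}{2} \big( \omega - \omega^T \big)\, \Lambda^{-1} - d\Lambda\, \Lambda^{-1}
\end{align*}
```

The symmetric part therefore transforms homogeneously, so it can consistently be set to zero, leaving an antisymmetric connection.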

Antisymmetry of $\omega$, as demonstrated above, allows us to check that the
following identity holds:

$$ d(A_a A^a) = (DA)_a\, A^a + A_a\, (DA)^a \tag{4.39} $$

On general O(n) tensors, the definition of the covariant derivative naturally
extends to:

$$ (DA)_{ab \cdots c} = dA_{ab \cdots c} + \omega_a{}^d A_{db \cdots c} + \omega_b{}^d A_{ad \cdots c} + \cdots + \omega_c{}^d A_{ab \cdots d} \tag{4.40} $$

Exercise: Check that the O(n) transformation $A_{ab \cdots c} \to \Lambda_a{}^p \Lambda_b{}^q \cdots \Lambda_c{}^r A_{pq \cdots r}$,
together with the transformation for $\omega$ discovered above, transforms the covariant
derivative $(DA)_{ab \cdots c}$ in the same way as A.


The spin connection is a differential 1-form but is not itself an O(n) tensor,
as is evident from the inhomogeneous transformation law written above.
However, using the spin connection we can define two very basic tensors associated
to a manifold. Applying D to the orthonormal frames $e^a$ (which are O(n)
vector-valued 1-forms), one gets an O(n) vector-valued 2-form.


Definition: The 2-form:

$$ T^a = De^a = de^a + \omega^a{}_b \wedge e^b \tag{4.41} $$

is called the torsion 2-form on the manifold M.


Another tensor arises naturally by asking the following question. We have 
seen that in the absence of a spin connection, the exterior derivative d satisfies 


3Note that it makes no difference whether the O(n) indices are raised or lowered, as this is
done using the identity metric $\delta_{ab}$. In applications to general relativity the group O(n) that
acts on frames will be replaced by the Lorentz group O(n − 1, 1), in which case the metric that
raises and lowers these indices is the Minkowski metric $\eta_{ab} = \mathrm{diag}(-1, 1, \cdots, 1)$. Here too the
difference between raised and lowered indices is trivial, involving at most a change of sign.




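$d^2 = 0$. But what about $D^2$, the square of the covariant exterior derivative D?
This is easily calculated by acting twice with D on an arbitrary O(n) vector $A^a$:

$$(D^2 A)^a = (D(DA))^a = d(dA^a + \omega^a{}_b A^b) + \omega^a{}_b \wedge (dA^b + \omega^b{}_c A^c)$$
$$= d\omega^a{}_b\, A^b - \omega^a{}_b \wedge dA^b + \omega^a{}_b \wedge dA^b + \omega^a{}_b \wedge \omega^b{}_c\, A^c$$
$$= (d\omega^a{}_b + \omega^a{}_c \wedge \omega^c{}_b) A^b \equiv R^a{}_b A^b \qquad (4.42)$$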


Definition: The curvature 2-form is defined by:
$$R^a{}_b = d\omega^a{}_b + \omega^a{}_c \wedge \omega^c{}_b \qquad (4.43)$$


which in components may be represented: 


Rr.ge° Net (4.44) 


The torsion and curvature 2-forms are fundamental to the study of Riemannian 
manifolds. 


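Exercise: Check the following properties of $T^a$ and $R^a{}_b$:

$$(DT)^a = dT^a + \omega^a{}_b \wedge T^b = R^a{}_b \wedge e^b$$
$$(DR)^a{}_b = dR^a{}_b + \omega^a{}_c \wedge R^c{}_b + \omega_b{}^c \wedge R^a{}_c = 0 \qquad (4.45)$$
The second of these relations is called the Bianchi identity.


Under O(n) rotations, the torsion and curvature transform respectively as
$$T^a \to \Lambda^a{}_b T^b$$
$$R^a{}_b \to \Lambda^a{}_c \Lambda_b{}^d R^c{}_d = (\Lambda R \Lambda^{-1})^a{}_b \qquad (4.46)$$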


Let us now turn to a new type of connection. We have seen that a spin
connection is needed in order to differentiate O(n) tensors covariantly. Suppose
instead that we wish to covariantly differentiate a vector field

$$A = A^\mu \frac{\partial}{\partial x^\mu} \qquad (4.47)$$
As before, the ordinary derivative produces something that does not transform
as a tensor:
$$dA = \partial_\nu A^\mu\, dx^\nu \otimes \frac{\partial}{\partial x^\mu} \qquad (4.48)$$
So again, we postulate a covariant derivative:
$$DA = D_\nu A^\mu\, dx^\nu \otimes \frac{\partial}{\partial x^\mu} \qquad (4.49)$$


such that $D_\nu A^\mu$ transforms as a tensor under changes of coordinates. Requiring
linearity, we assume

$$D_\nu A^\mu = \partial_\nu A^\mu + \Gamma^\mu_{\nu\lambda} A^\lambda \qquad (4.50)$$

The quantity $\Gamma^\mu_{\nu\lambda}$ is called the affine connection. In order for DA to
transform covariantly under general coordinate transformations $x^\mu \to x'^\mu(x)$, the
affine connection must transform as

$$\Gamma'^\mu_{\nu\lambda}(x') = \frac{\partial x'^\mu}{\partial x^\alpha} \frac{\partial x^\beta}{\partial x'^\nu} \frac{\partial x^\gamma}{\partial x'^\lambda}\, \Gamma^\alpha_{\beta\gamma}(x) + \frac{\partial x'^\mu}{\partial x^\alpha} \frac{\partial^2 x^\alpha}{\partial x'^\nu\, \partial x'^\lambda} \qquad (4.51)$$

Exercise: Check that with the above transformation law, the covariant derivative
of a vector field transforms covariantly under general coordinate
transformations.


On 1-forms $B = B_\mu\, dx^\mu$, the covariant derivative is defined similarly, but
with an important change of sign and different index contractions:

$$DB = D_\mu B_\nu\, dx^\mu \otimes dx^\nu$$
$$D_\mu B_\nu = \partial_\mu B_\nu - \Gamma^\lambda_{\mu\nu} B_\lambda \qquad (4.52)$$


This is required so that the contraction of a covariant and contravariant tensor 
behaves as a scalar. 

Note that DB as we have defined it above is not an antisymmetric 2-form,
but just a general rank-2 covariant tensor. We could, of course, take the
antisymmetric part of it to get a 2-form. In that situation, recalling the properties of
exterior derivatives, one would not need any connection at all. From the above
equation, we find that the antisymmetrised D acting on a 1-form contains as a
connection the object $\Gamma^\lambda_{\mu\nu} - \Gamma^\lambda_{\nu\mu}$, which is called the torsion associated to the
affine connection⁴. It is easy to check that this torsion transforms as a tensor,
and can be chosen to vanish consistently with the covariance properties of the
derivative.

Returning now to the spin connection $\omega$, we have observed above that its
symmetric part transforms as a tensor. Thus it is not really required in order to
fulfill the main objective of a connection, which is to provide a rule for covariant
differentiation. Therefore one (standard) way to specify $\omega$ is to require that it
be antisymmetric. If in addition we require the torsion (the covariant derivative
of the frames $e^a$) to vanish, it can easily be seen that $\omega$ is completely determined
in terms of the frames $e^a$. To see this, let us impose:


$$\text{antisymmetry:} \quad \omega_{ab} = -\omega_{ba}$$
$$\text{no torsion:} \quad T^a = (De)^a = 0 \qquad (4.53)$$


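Exercise: Show that the above conditions imply the relation:
$$\omega_\mu{}^{ab} = \frac{1}{2} e^{a\nu} \left[ (\partial_\mu e^b_\nu - \partial_\nu e^b_\mu) - \frac{1}{2} e^{b\lambda} e^c_\mu (\partial_\nu e_{c\lambda} - \partial_\lambda e_{c\nu}) \right] - (a \leftrightarrow b) \qquad (4.54)$$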


4This is distinct from the torsion defined above in terms of orthonormal frames. 




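The $\omega$ so defined is sometimes known as the Levi-Civita spin connection.
Similarly, an affine connection $\Gamma$ can be uniquely specified by two conditions:
$$\text{metricity:} \quad D_\mu g_{\nu\lambda} = \partial_\mu g_{\nu\lambda} - \Gamma^\alpha_{\mu\nu} g_{\alpha\lambda} - \Gamma^\alpha_{\mu\lambda} g_{\nu\alpha} = 0$$
$$\text{no torsion:} \quad \Gamma^\lambda_{\mu\nu} - \Gamma^\lambda_{\nu\mu} = 0 \qquad (4.55)$$


Metricity is the same as covariant constancy of the metric.
The affine connection is uniquely determined in terms of the metric by the
two conditions above:


$$\Gamma^\lambda_{\mu\nu} = \frac{1}{2} g^{\lambda\alpha} (g_{\alpha\mu,\nu} + g_{\alpha\nu,\mu} - g_{\mu\nu,\alpha}) \qquad (4.56)$$


This is called the Levi-Civita affine connection, or Christoffel symbol.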


Exercise: Demonstrate that Eq.(4.56) follows from the two conditions in Eq.(4.55) 
above. 
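As an illustration of Eq.(4.56), the Christoffel symbols can be evaluated numerically from a metric. The sketch below is only an example under stated assumptions: it uses the unit 2-sphere with coordinates $(\theta, \phi)$ and metric $\mathrm{diag}(1, \sin^2\theta)$ as a hypothetical test case, and it replaces the derivatives $g_{\mu\nu,\alpha}$ by central finite differences. The helper names are ours, not the text's.

```python
import math

def sphere_metric(x):
    """Metric of the unit 2-sphere in coordinates x = (theta, phi)."""
    theta, _ = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def metric_derivative(metric, x, k, h=1e-5):
    """Central difference for d g_{ab} / d x^k, returned as a matrix in (a, b)."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = metric(xp), metric(xm)
    n = len(x)
    return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(n)]
            for a in range(n)]

def christoffel(metric, x):
    """Gamma[l][m][n] = 1/2 g^{la} (g_{am,n} + g_{an,m} - g_{mn,a}), Eq. (4.56).

    Written for two dimensions only, because of the 2x2 inverse used here.
    """
    n = len(x)
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, k) for k in range(n)]  # dg[k][a][b] = g_{ab,k}
    return [[[0.5 * sum(g_inv[l][a] * (dg[nu][a][mu] + dg[mu][a][nu] - dg[a][mu][nu])
                        for a in range(n))
              for nu in range(n)] for mu in range(n)] for l in range(n)]
```

For the sphere one expects $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \cot\theta$, with all other independent components vanishing. The numerical values of `christoffel(sphere_metric, [theta, phi])` can be compared against these.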


The spin connection and affine connection can be related to each other by
requiring covariant constancy of the frames:


$$D_\mu e^a_\nu = \partial_\mu e^a_\nu + \omega_\mu{}^a{}_b\, e^b_\nu - \Gamma^\lambda_{\mu\nu} e^a_\lambda = 0 \qquad (4.57)$$
This equation clearly determines either connection in terms of the other one.


Exercise: Solve the above equation for the spin connection in terms of the
frames and the affine connection. Next, use Eq.(4.56) to express the affine
connection in terms of the metric and thence in terms of frames using
$g_{\mu\nu} = e^a_\mu e^a_\nu$. At the end, you will have an expression for the spin connection in terms
of the frames. Check that this is identical to Eq.(4.54).


From all the above, it is evident that both the spin connection and the
affine connection are essential ingredients in any system where we would like
to differentiate vector/tensor fields, differential forms, and O(n) tensors on a
manifold. While there is a certain degree of arbitrariness in their definition,
there are certain “minimal” conditions which can be imposed to render them
unique. In applications to physics, it will turn out that these conditions are
naturally satisfied in the theories of interest. The corresponding connections
are then dependent variables, being determined by the frames in the case of the
spin connection and by the metric in the case of the affine connection⁵.
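To make these constructions concrete, here is a short worked example, offered as an illustration rather than as part of the original text. It assumes the unit 2-sphere with coordinates $(\theta, \phi)$ and the frame choice $e^1 = d\theta$, $e^2 = \sin\theta\, d\phi$. One can check directly that the antisymmetric connection written below has vanishing torsion, and then read off the curvature 2-form.

```latex
% Assumed example: unit 2-sphere, ds^2 = d\theta^2 + \sin^2\theta\, d\phi^2
\begin{align*}
e^1 &= d\theta, \qquad e^2 = \sin\theta\, d\phi, \\
\omega^1{}_2 &= -\cos\theta\, d\phi = -\omega^2{}_1, \\
T^1 &= de^1 + \omega^1{}_2 \wedge e^2
     = 0 - \cos\theta \sin\theta\, d\phi \wedge d\phi = 0, \\
T^2 &= de^2 + \omega^2{}_1 \wedge e^1
     = \cos\theta\, d\theta \wedge d\phi + \cos\theta\, d\phi \wedge d\theta = 0, \\
R^1{}_2 &= d\omega^1{}_2 + \omega^1{}_c \wedge \omega^c{}_2
     = \sin\theta\, d\theta \wedge d\phi = e^1 \wedge e^2 .
\end{align*}
```

The term $\omega^1{}_c \wedge \omega^c{}_2$ vanishes here because $\omega^1{}_1 = \omega^2{}_2 = 0$. The result $R^1{}_2 = e^1 \wedge e^2$ expresses the constant curvature of the sphere.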


4.4 The Volume Form 


Recall that the group SO(n) differs from the orthogonal group O(n) in that 
in the former, the rotation matrices are required to have unit determinant in 


5Some physical applications appear to require an affine connection that has nonzero torsion 
(i.e. is not symmetric in its lower indices). However, since the torsion so defined is a tensor, 
it can always be treated separately from the connection and this choice usually proves more 
convenient in practice. 




addition to being orthogonal. Thus, reflection of an odd number of space di- 
mensions, which constitutes an orthogonal transformation disconnected from 
the identity, is not included in SO(n) though it is part of O(n). 

Now for an orientable n-dimensional manifold we should ignore those frame 
rotations that are not continuously connected to the identity, as such rotations 
would reverse the orientation. Therefore we restrict ourselves to SO(n) rotations 
of the frames rather than O(n). Indeed, frames of a given handedness can be 
chosen continuously everywhere on an orientable manifold. 

Consider the n-form:


$$\lambda = e^1 \wedge e^2 \wedge \cdots \wedge e^n \qquad (4.58)$$
This is invariant under SO(n) rotations, as the following exercise shows.


Exercise: Prove that when $e^a \to \Lambda^a{}_b e^b$ with $\Lambda^a{}_b \in O(n)$, $\lambda$ changes to $(\det \Lambda)\lambda$.
Hence if $\det \Lambda = 1$ (equivalently, $\Lambda \in SO(n)$) then $\lambda$ is invariant.


Definition: The n-form $\lambda$ defined above is called a Riemannian volume form.


We will see that the volume form plays a crucial role in integration theory on 
manifolds. 


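We can rewrite the volume form as

$$\lambda = e^1_{\mu_1} \cdots e^n_{\mu_n}\, dx^{\mu_1} \wedge \cdots \wedge dx^{\mu_n}$$
$$= (\det e)\, dx^1 \wedge dx^2 \wedge \cdots \wedge dx^n$$
$$= \sqrt{g}\, dx^1 \wedge dx^2 \wedge \cdots \wedge dx^n \qquad (4.59)$$

where $g = \det g_{\mu\nu}$.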


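Another useful way to write it is in terms of the totally antisymmetric
$\epsilon$-tensor, defined as:

$$\epsilon_{\mu_1 \cdots \mu_n} = 0 \quad \text{(if two indices coincide)}$$
$$= +1 \quad \text{(if } (\mu_1 \cdots \mu_n) \text{ is an even permutation)}$$
$$= -1 \quad \text{(if } (\mu_1 \cdots \mu_n) \text{ is an odd permutation)} \qquad (4.60)$$

(An even (odd) permutation means that $(\mu_1 \cdots \mu_n)$ is obtained from $(1 \cdots n)$
by an even (odd) number of pairwise interchanges.) In terms of this tensor, the
volume form may be written

$$\lambda = \frac{\sqrt{g}}{n!}\, \epsilon_{\mu_1 \cdots \mu_n}\, dx^{\mu_1} \wedge \cdots \wedge dx^{\mu_n}$$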

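One important role of the volume form is that it allows us to define a
“duality” operation which maps p-forms to (n − p)-forms. This is carried out
via the Hodge dual operation *, defined by:

$$*(dx^{\mu_1} \wedge \cdots \wedge dx^{\mu_p}) = \frac{\sqrt{g}}{(n-p)!}\, \epsilon^{\mu_1 \cdots \mu_p}{}_{\mu_{p+1} \cdots \mu_n}\, dx^{\mu_{p+1}} \wedge \cdots \wedge dx^{\mu_n}$$

Equivalently, if A is a p-form:
$$A = A_{\mu_1 \cdots \mu_p}\, dx^{\mu_1} \wedge \cdots \wedge dx^{\mu_p}$$
then *A is the (n − p)-form satisfying
$$A \wedge *A = g^{\mu_1 \nu_1} \cdots g^{\mu_p \nu_p}\, A_{\mu_1 \cdots \mu_p} A_{\nu_1 \cdots \nu_p}\, \lambda$$

where $\lambda$ is the volume form. It follows that

$$*A = \frac{\sqrt{g}}{(n-p)!}\, A_{\mu_1 \cdots \mu_p}\, \epsilon^{\mu_1 \cdots \mu_p}{}_{\mu_{p+1} \cdots \mu_n}\, dx^{\mu_{p+1}} \wedge \cdots \wedge dx^{\mu_n}$$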


Since the above discussion is somewhat abstract, let us give a couple of
concrete examples. In three dimensions with the Euclidean (identity) metric,
the volume form is:

$$\lambda = dx^1 \wedge dx^2 \wedge dx^3 \qquad (4.61)$$

The Hodge dual of a 1-form is a 2-form via:
$$*dx^1 = dx^2 \wedge dx^3 \qquad (4.62)$$

The familiar “cross product” in vector analysis makes use of this dual. Given
two 3-vectors $\vec{v}$ and $\vec{w}$, one defines two 1-forms:

$$v = v_i\, dx^i, \qquad w = w_i\, dx^i \qquad (4.63)$$
The wedge product of these 1-forms is the 2-form:
$$v \wedge w = v_i w_j\, dx^i \wedge dx^j \qquad (4.64)$$
Finally, we define the Hodge dual of this 2-form, which is a 1-form:
$$*(v \wedge w) = \epsilon_{ijk}\, v_i w_j\, dx^k \qquad (4.65)$$
The components of this 1-form are
$$(v_2 w_3 - v_3 w_2,\ v_3 w_1 - v_1 w_3,\ v_1 w_2 - v_2 w_1) \qquad (4.66)$$

which are precisely the components of the vector $\vec{v} \times \vec{w}$, familiar as the “cross
product”. It is unique to 3 dimensions that the wedge product of two vectors
can again, using the Hodge dual, be expressed as a vector.
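The identification of $*(v \wedge w)$ with the cross product is easy to verify by direct computation. The following minimal sketch assumes Euclidean $\mathbb{R}^3$ with indices running over 0, 1, 2; the function names are ours and purely illustrative.

```python
def levi_civita(i, j, k):
    """Totally antisymmetric symbol on the indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def hodge_of_wedge(v, w):
    """Components of *(v ^ w) = eps_{ijk} v_i w_j dx^k in Euclidean R^3."""
    return [sum(levi_civita(i, j, k) * v[i] * w[j]
                for i in range(3) for j in range(3))
            for k in range(3)]

def cross(v, w):
    """Cross product written out as in the component list (4.66)."""
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]
```

For any pair of vectors, `hodge_of_wedge(v, w)` and `cross(v, w)` return the same list of components.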


4.5 Isometry 


In preceding chapters, we have defined the criteria that two spaces be identical
as topological spaces (homeomorphism) and as differentiable manifolds
(diffeomorphism). Now we identify the property which makes two spaces equivalent
as Riemannian manifolds.


Definition: A map $\phi: X \to Y$, where X and Y are two Riemannian manifolds,
is an isometry if:

(i) $\phi$ is a homeomorphism.
(ii) $\phi$ is $C^\infty$ in both directions.
(iii) The differential map $\phi_* : T_p(X) \to T_{\phi(p)}(Y)$ preserves the metric.


The first two requirements amount to saying that $\phi$ is a diffeomorphism of
the two manifolds, while the last requirement says that if A and B are any two
vectors in $T_p(X)$, then

$$(\phi_* A, \phi_* B)_g = (A, B)_g$$

If two Riemannian manifolds admit an isometry between them then they
are said to be isometric, and are identical in topology, differentiable structure
and metric. We consider them to be equivalent.


4.6 Integration of Differential Forms 


We have seen that on an n-dimensional manifold, an n-form has a single
independent component by virtue of total antisymmetry in n indices. In fact, the
dual of an n-form is a 0-form or function. We can represent any n-form as:
$$\omega = \omega_{\mu_1 \cdots \mu_n}(x)\, dx^{\mu_1} \wedge \cdots \wedge dx^{\mu_n}$$
$$= a(x)\, dx^1 \wedge \cdots \wedge dx^n \qquad (4.67)$$
where
$$a(x) = \epsilon^{\mu_1 \cdots \mu_n}\, \omega_{\mu_1 \cdots \mu_n}(x)$$
We would now like to define integration on a Riemannian manifold. For 
this, we assume familiarity with the integral of ordinary functions on Euclidean 
space, which is the same as the integral of a function of many variables over 
some range of its arguments. The more general integration we will now define 
has to be independent of the coordinate charts we use to cover the manifold, 
and invariant under isometries of the metric. As we will see, the natural object 
possessing these properties is the integral of an n-form over an n-dimensional 
manifold. 

We will require the manifold to be orientable. This requirement can be 
guessed at from the fact that an integral of an ordinary function changes sign 
when we reverse the integration limits, or equivalently when we reverse the 
integration measure. The generalisation of this property will be that our integral 
changes sign under reversal of the orientation of the manifold. 

To define the integral of an n-form on an orientable n-dimensional manifold,
we recast the problem in terms of the usual integration of the function $a(x)$ that
is Hodge dual to the n-form, over suitable subsets of $\mathbb{R}^n$. The latter integral is
just the usual Riemann integral. Combining the subsets will be done in a way
that fulfils the requirements spelt out above.

Thus, we start off with a particular chart $(M_\alpha, \phi_\alpha)$ on M and define the
integral of our n-form over this chart as:


$$\int_{M_\alpha} \omega = \int_{\phi_\alpha(M_\alpha) \subset \mathbb{R}^n} a(x)\, dx^1 \cdots dx^n$$




The right hand side is, as promised, the ordinary integral of a function of several 
variables over some range of these variables. 

Note that $a(x)$ is not a scalar function on the manifold, since $\omega$ is
coordinate-independent while:

$$dx^1 \wedge \cdots \wedge dx^n \to dy^1 \wedge \cdots \wedge dy^n = \left| \frac{\partial y}{\partial x} \right| dx^1 \wedge \cdots \wedge dx^n$$
It follows that under coordinate transformations,
$$a(x) \to a'(y) = \left| \frac{\partial x}{\partial y} \right| a(x)$$

or in words, $a(x)$ transforms by a multiplicative Jacobian factor. Such an object
is sometimes called a scalar density.

This transformation of $a(x)$ precisely cancels the variation under coordinate
transformations of the Cartesian integration measure $dx^1 \cdots dx^n$. Thus the
integrand $a(x)\, dx^1 \cdots dx^n$ is coordinate independent. This explains why we
started with an n-form, whose Hodge dual is a scalar density, instead of simply
trying to integrate a scalar function.

Suppose now we take another chart $(M_\alpha, \psi_\alpha)$, namely the same open set
$M_\alpha$ but a different homeomorphism $\psi_\alpha$ in place of the original $\phi_\alpha$. We then see
that:


$$\int_{\psi_\alpha(M_\alpha) \subset \mathbb{R}^n} a'(y)\, dy^1 \cdots dy^n = \int_{\phi_\alpha(M_\alpha) \subset \mathbb{R}^n} a(x)\, dx^1 \cdots dx^n$$


Thus, $\int_{M_\alpha} \omega$ on a given open set of M is invariantly defined.
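The cancellation between the Jacobian factor in $a'(y)$ and the change in the integration measure can be seen in a simple numerical experiment. The sketch below is only illustrative: it assumes the unit disc in $\mathbb{R}^2$, the hypothetical component $a(x) = (x^1)^2 + (x^2)^2$, and a second chart given by polar coordinates (which misses a ray of measure zero). Both integrals are approximated by midpoint sums, so they agree only up to discretisation error.

```python
import math

def a_cartesian(x, y):
    """Component a(x) of the 2-form a(x) dx^1 ^ dx^2 (a hypothetical choice)."""
    return x * x + y * y

def integrate_cartesian(n=400):
    """Midpoint sum of a(x) over the unit disc, using Cartesian coordinates."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        for j in range(n):
            y = -1.0 + (j + 0.5) * h
            if x * x + y * y < 1.0:       # keep only cells centred inside the disc
                total += a_cartesian(x, y) * h * h
    return total

def integrate_polar(n=400):
    """Same integral in polar coordinates y = (r, theta).

    The component transforms as a scalar density, a'(y) = |dx/dy| a(x),
    and for polar coordinates the Jacobian |dx/dy| equals r.
    """
    dr, dtheta = 1.0 / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            theta = (j + 0.5) * dtheta
            x, y = r * math.cos(theta), r * math.sin(theta)
            total += r * a_cartesian(x, y) * dr * dtheta
    return total
```

Both functions approximate the same number, $\pi/2$. The Cartesian sum is the less accurate of the two because the grid resolves the circular boundary only roughly.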

It only remains to put the contributions from different open sets together
to get an integral over all of M. In order not to encounter divergences in the
process, we confine ourselves to n-forms with compact support. The support of
$\omega$ is defined as:


$$\mathrm{Supp}(\omega) = \text{closure of } \{\, x \in M \mid \omega(x) \neq 0 \,\}$$


If this set is compact, as defined in Chapter 1, then $\omega$ is said to have compact
support.

Alternatively, instead of worrying about forms with compact support we
may simply confine our attention to integration over compact manifolds. Since
every closed subset of a compact manifold is compact (see Chapter 1), we are
then guaranteed compact support.

Finally we turn to the actual process of patching up the integral of $\omega$ over
various open sets of M. This is achieved by defining a special collection of
functions on M called a partition of unity. Roughly, we will decompose the
constant function 1 on the manifold into a sum of functions each having support
only within a given open set in the cover. This construction will enable us to
patch together the integrals of a form on the various charts $M_\alpha$, which we have
defined above, to give an integral over all of M.




To implement this, first we need a few definitions. 


Definition: An open covering $\{U_\alpha\}$ of M is said to be locally finite if each point
$p \in M$ is contained in a finite number of $U_\alpha$. A space is paracompact if every
open cover admits a locally finite refinement. (A refinement is defined as follows:
If $\{U_\alpha\}$ is an open cover, then another open cover $\{V_\beta\}$ is a refinement of $\{U_\alpha\}$
if every set $V_\beta$ is contained in some set $U_\alpha$.) Evidently, paracompactness is a
weaker requirement than compactness.


Now take a paracompact manifold M (incidentally, every metric space is
paracompact) and a locally finite atlas $\{U_\alpha, \phi_\alpha\}$. Pick a family of
differentiable functions $e_\alpha(x)$ on M. Each $e_\alpha$ is chosen to be nonzero only within the
corresponding open set $U_\alpha$. The precise requirements are:

(i) $0 \le e_\alpha(x) \le 1$, for each $\alpha$ and all $x \in U_\alpha$.

(ii) $e_\alpha$ are $C^\infty$ functions.

(iii) $e_\alpha(x) = 0$ if $x \notin U_\alpha$.

(iv) $\sum_\alpha e_\alpha(x) = 1$.

Note that the sum in (iv) contains finitely many terms for each x, precisely
because the cover is locally finite.

The collection of functions $e_\alpha$ is called a partition of unity subordinate to
the open cover $\{U_\alpha, \phi_\alpha\}$. Given such a partition of unity, we can finally define
the integral of an n-form over the entire manifold M.
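A partition of unity is easy to exhibit explicitly in one dimension. The following sketch is an illustration under simple assumptions: the manifold is the real line, and the open cover consists of the two hypothetical sets $U_1 = (-\infty, 2)$ and $U_2 = (1, \infty)$. The smooth functions are built from the standard $C^\infty$ function $\exp(-1/t)$.

```python
import math

def smooth_cutoff(t):
    """C-infinity function equal to exp(-1/t) for t > 0 and 0 otherwise."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def smooth_step(t):
    """C-infinity step: 0 for t <= 0, 1 for t >= 1, strictly between otherwise."""
    return smooth_cutoff(t) / (smooth_cutoff(t) + smooth_cutoff(1.0 - t))

# Hypothetical open cover of the real line: U_1 = (-inf, 2), U_2 = (1, +inf).
def e1(x):
    """Member of the partition that vanishes outside U_1 (that is, for x >= 2)."""
    return 1.0 - smooth_step(x - 1.0)

def e2(x):
    """Member of the partition that vanishes outside U_2 (that is, for x <= 1)."""
    return smooth_step(x - 1.0)
```

By construction `e1(x) + e2(x)` equals 1 at every point, each function lies between 0 and 1, and each vanishes outside its own open set, so requirements (i), (iii) and (iv) can be checked directly. Smoothness, requirement (ii), follows from that of $\exp(-1/t)$.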


Definition: The integral of an n-form $\omega$ over a manifold M is:


$$\int_M \omega = \sum_\alpha \int_{U_\alpha} e_\alpha\, \omega$$


In order for this definition to have an intrinsic meaning, it must be inde- 
pendent of the choice of the partition of unity and the atlas. We now prove that 
this is the case. In the process, the meaning of the definition should become 
clearer. 

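Given two partitions of unity, $e_\alpha$ subordinate to $(U_\alpha, \phi_\alpha)$ and $e'_{\alpha'}$
subordinate to $(U'_{\alpha'}, \phi'_{\alpha'})$, we have

$$\sum_\alpha \int_{U_\alpha} e_\alpha\, \omega = \sum_\alpha \int_{\phi_\alpha(U_\alpha)} e_\alpha(x)\, a(x)\, dx^1 \cdots dx^n$$
$$= \sum_\alpha \int_{\mathbb{R}^n} e_\alpha(x)\, a(x)\, dx^1 \cdots dx^n \qquad (4.68)$$

where in the second line we were able to extend the range of integration to all
of $\mathbb{R}^n$ since $e_\alpha$ has support only in $\phi_\alpha(U_\alpha)$.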


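Now we multiply by $1 = \sum_{\alpha'} e'_{\alpha'}$ to get

$$\sum_\alpha \int_{U_\alpha} e_\alpha\, \omega = \sum_\alpha \int_{\mathbb{R}^n} \Big( \sum_{\alpha'} e'_{\alpha'}(x) \Big)\, e_\alpha(x)\, a(x)\, dx^1 \cdots dx^n$$
$$= \sum_{\alpha, \alpha'} \int_{\mathbb{R}^n} e_\alpha(x)\, e'_{\alpha'}(x)\, a(x)\, dx^1 \cdots dx^n$$
$$= \sum_{\alpha'} \int_{\mathbb{R}^n} \Big( \sum_\alpha e_\alpha(x') \Big)\, e'_{\alpha'}(x')\, a'(x')\, dx'^1 \cdots dx'^n$$
$$= \sum_{\alpha'} \int_{\mathbb{R}^n} e'_{\alpha'}(x')\, a'(x')\, dx'^1 \cdots dx'^n$$
$$= \sum_{\alpha'} \int_{\phi'_{\alpha'}(U'_{\alpha'})} e'_{\alpha'}(x')\, a'(x')\, dx'^1 \cdots dx'^n$$
$$= \sum_{\alpha'} \int_{U'_{\alpha'}} e'_{\alpha'}\, \omega \qquad (4.69)$$

In the third step we used the coordinate independence of $a(x)\, dx^1 \cdots dx^n$, while
in the last step we used the fact that $e'_{\alpha'}$ has support in $U'_{\alpha'}$. This proves the
desired result.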


4.7 Stokes’ Theorem 


For a suitable region in IR”, Stokes’ theorem is a familiar result which relates 
the integral of a divergence over the region, to the integral of the corresponding 
function over the boundary of the region. The generalization to manifolds, which 
we now develop, is an important theorem and will reappear in the context of de 
Rham cohomology which we will study later on. 


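Let $\omega$ be an (n − 1)-form on an n-dimensional manifold:
$$\omega = \omega_{\mu_1 \cdots \mu_{n-1}}(x)\, dx^{\mu_1} \wedge \cdots \wedge dx^{\mu_{n-1}}$$

The Hodge dual of $\omega$ is a 1-form with components which we will denote $a^\nu(x)$.
Then we can write:

$$\omega = \sum_{\nu=1}^{n} (-1)^{\nu-1}\, a^\nu(x)\, dx^1 \wedge \cdots \wedge \widehat{dx^\nu} \wedge \cdots \wedge dx^n$$

($\widehat{dx^\nu}$ means $dx^\nu$ is deleted).
Now the exterior derivative of $\omega$ is an n-form, so its dual will be a 0-form.
We have:

$$d\omega = \sum_{\mu=1}^{n} \partial_\mu \omega_{\mu_1 \cdots \mu_{n-1}}(x)\, dx^\mu \wedge dx^{\mu_1} \wedge \cdots \wedge dx^{\mu_{n-1}}$$
$$= \sum_{\mu=1}^{n} \sum_{\nu=1}^{n} (-1)^{\nu-1}\, \partial_\mu a^\nu(x)\, dx^\mu \wedge dx^1 \wedge \cdots \wedge \widehat{dx^\nu} \wedge \cdots \wedge dx^n$$
$$= \Big( \sum_\nu \partial_\nu a^\nu(x) \Big)\, dx^1 \wedge \cdots \wedge dx^n \qquad (4.70)$$

(The last equation looks non-covariant: in fact, neither $\partial_\nu a^\nu$ nor $dx^1 \wedge \cdots \wedge dx^n$
is coordinate independent, but the product is, since it is $d\omega$.)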

Thus we see that the exterior derivative of an (n—1)-form can be expressed 
in terms of the divergence of the components of the dual 1-form. To set up 
Stokes’ theorem we next have to define suitable regions in a manifold. 


Definition: Let M be a differentiable manifold. A subset D of M is called a
regular domain (or simply domain) if for each $p \in \bar{D}$ (recall that $\bar{D}$ denotes the
closure of D), either one of the following holds:

(i) There exists a local chart $(U, \phi)$ of M with $p \in U$, such that $\phi(\bar{D} \cap U)$ is
open in $\mathbb{R}^n$, or

(ii) No such chart exists, but there exists a chart $(U, \phi)$ with $p \in U$ such that
$\phi(\bar{D} \cap U)$ is an open set of the half-space


$$H = \{\, (x^1, \cdots, x^n) \in \mathbb{R}^n \mid x^n \ge 0 \,\}$$


Points $p \in \bar{D}$ satisfying (i) are called interior points of D, while points
satisfying (ii) are called boundary points of D. The reason for this terminology
should be evident from the definition.


The set of boundary points of D is denoted $\partial D$ and defined as
$$\partial D = \{\, p \in \bar{D} \mid x^n = 0 \,\}$$


It is a theorem (whose proof we skip here) that $\partial D$ is an (n − 1) dimensional
orientable submanifold of $\bar{D}$.


We now have the necessary ingredients to state, and prove, Stokes’ Theorem.


Theorem (Stokes): Let M be an oriented n-dimensional manifold and let
D be a regular domain in M. Let $\omega$ be an (n − 1)-form on M with compact
support. Then
$$\int_D d\omega = \int_{\partial D} \omega$$


(Strictly speaking, $\omega$ is a form on M and not on $\partial D$. But we have an injective
map, the inclusion $i: \partial D \to M$, which maps points in $\partial D$ to the corresponding
points in M. Then the map $i^*$ which we defined earlier sends forms on M to
forms on $\partial D$. Thus, by $\omega$ on the RHS of the above, we really mean $i^*\omega$.)
Proof: Choose a locally finite cover $(U_\alpha, \phi_\alpha)$ of M and a partition of unity $e_\alpha$
subordinate to this cover. Then


$$\int_D d\omega = \sum_\alpha \int_D d(e_\alpha\, \omega)$$


On the RHS, we have a sum over forms, each of which has support in a single
$U_\alpha$. Thus, it is enough to verify the theorem for such forms, so we take $\omega$ to
have support in one particular $U_\alpha$.

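Now we can prove Stokes’ theorem in each of two possible situations:
Case 1: $U_\alpha \cap \partial D = \emptyset$ (the open set does not intersect the boundary of D).
Then:
$$\int_{\partial D} \omega = 0$$


Now, since $U_\alpha$ does not intersect the boundary of D, it either lies entirely inside
or entirely outside D. Thus $U_\alpha \subset M - D$ or $U_\alpha \subset D$. In the former case,
$\int_D d\omega = 0$ and we are done. In the latter case,


$$\int_D d\omega = \int_{U_\alpha} d\omega = \int_{\phi_\alpha(U_\alpha)} \Big( \sum_\nu \partial_\nu a^\nu \Big)\, dx^1 \cdots dx^n$$


Since $\omega$ has compact support, we can take the integration region to be a cube
in $\mathbb{R}^n$, call it C, of side $2\lambda$, such that $a^\nu = 0$ on the border of C (any $x^\mu = \pm\lambda$).
Then
$$\int_D d\omega = \int_C \Big( \sum_\nu \partial_\nu a^\nu(x) \Big)\, dx^1 \cdots dx^n$$
$$= \sum_\nu \int \Big[ a^\nu(x) \Big]_{x^\nu = -\lambda}^{x^\nu = \lambda}\, dx^1 \cdots \widehat{dx^\nu} \cdots dx^n = 0$$


Thus we have shown that both sides of Stokes’ theorem vanish in this case,
hence the theorem is true.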


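Case 2: $U_\alpha \cap \partial D \neq \emptyset$. Then, since $\partial D$ is defined by $x^n = 0$, we have


$$\int_{\partial D} \omega = (-1)^{n-1} \int a^n(x^1, \cdots, x^{n-1}, 0)\, dx^1 \cdots dx^{n-1}$$
$$\int_D d\omega = \int_C \Big( \sum_\nu \partial_\nu a^\nu \Big)\, dx^1 \cdots dx^n$$


where now the cube extends from $-\lambda$ to $\lambda$ in all directions except the n-th one,
where it extends from 0 to $\lambda$. Thus, for any fixed $\nu \neq n$ we have:


$$\Big[ a^\nu \Big]_{x^\nu = -\lambda}^{x^\nu = \lambda} = 0 \qquad (4.71)$$


and
$$\int_C (\partial_\nu a^\nu)\, dx^1 \cdots dx^n = 0$$
(there is no sum over $\nu$ in the above equation), while:
$$\int_C (\partial_n a^n)\, dx^1 \cdots dx^n = (-1)^{n-1} \int a^n(x^1, \cdots, x^{n-1}, 0)\, dx^1 \cdots dx^{n-1}$$


Thus we have shown that both sides of Stokes’ theorem are equal to the same
expression. This finally proves the theorem.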
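For n = 2 the theorem reduces to the classical Green's theorem, which can be checked numerically. The sketch below is an illustration only and makes several simplifying assumptions: the region is the unit square (which has corners, so it is not strictly a regular domain in the sense defined above), and the 1-form $\omega = P\, dx + Q\, dy$ uses the hypothetical polynomial coefficients `P` and `Q`, which do not have compact support. Integrals are approximated by midpoint sums.

```python
def P(x, y):
    """Coefficient of dx in the hypothetical 1-form omega = P dx + Q dy."""
    return -x * y * y

def Q(x, y):
    """Coefficient of dy."""
    return x * x * y

def d_omega(x, y, h=1e-5):
    """Component of d(omega) = (dQ/dx - dP/dy) dx ^ dy, by central differences."""
    dq_dx = (Q(x + h, y) - Q(x - h, y)) / (2 * h)
    dp_dy = (P(x, y + h) - P(x, y - h)) / (2 * h)
    return dq_dx - dp_dy

def integral_over_domain(n=200):
    """Midpoint sum of d(omega) over D = [0, 1] x [0, 1]."""
    h = 1.0 / n
    return sum(d_omega((i + 0.5) * h, (j + 0.5) * h) * h * h
               for i in range(n) for j in range(n))

def integral_over_boundary(n=200):
    """Midpoint sum of omega along the boundary of D, traversed anticlockwise."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += P(s, 0.0) * h         # bottom edge, x increasing
        total += Q(1.0, s) * h         # right edge, y increasing
        total -= P(s, 1.0) * h         # top edge, x decreasing
        total -= Q(0.0, s) * h         # left edge, y decreasing
    return total
```

Here $d\omega = 4xy\, dx \wedge dy$, so both `integral_over_domain()` and `integral_over_boundary()` should be close to 1.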


4.8 The Laplacian on Forms 


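For a p-form $\omega = \omega_{\mu_1 \cdots \mu_p}(x)\, dx^{\mu_1} \wedge \cdots \wedge dx^{\mu_p}$, we have already defined the Hodge
dual:
$$*\omega = \frac{\sqrt{g}}{(n-p)!}\, \epsilon^{\mu_1 \cdots \mu_p}{}_{\mu_{p+1} \cdots \mu_n}\, \omega_{\mu_1 \cdots \mu_p}(x)\, dx^{\mu_{p+1}} \wedge \cdots \wedge dx^{\mu_n}$$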


The Hodge dual enables us to define an inner product on forms as follows:
$$(\alpha, \beta) = \int_M \alpha \wedge *\beta$$


Since $\alpha$ and $\beta$ are p-forms, $\alpha \wedge *\beta$ is an n-form, hence as we have just seen,
it makes sense to integrate it over the manifold M (we assume that M has no
boundary). The inner product thus takes two p-forms, for any p, and gives a
number.


In components,
$$\alpha \wedge *\beta = \sqrt{g}\, g^{\mu_1 \nu_1} \cdots g^{\mu_p \nu_p}\, \alpha_{\mu_1 \cdots \mu_p}\, \beta_{\nu_1 \cdots \nu_p}\, dx^1 \wedge \cdots \wedge dx^n$$


so the inner product of the forms is the integral over the inner product of the
components, with all indices contracted via the metric, and with a weight factor
$\sqrt{g}$ which ensures coordinate-independence of the result.

From the above equation it is clear that the inner product is symmetric:


$$(\alpha, \beta) = (\beta, \alpha)$$


and moreover, the inner product of a form with itself is greater than or equal
to zero:


$$(\alpha, \alpha) \ge 0$$


Since the integral of a non-negative quantity can vanish only if the quantity
itself vanishes, this can be zero only if the square of the form components
is zero at each point, which in turn means the form itself is identically zero.


Given the exterior derivative, the inner product allows us to define its
adjoint, denoted $\delta$. This is defined by


$$(\alpha, d\beta) = (\delta\alpha, \beta).$$


Going to components, one finds


$$\int_M \alpha \wedge *d\beta = (-1)^{np} \int_M (*d*\alpha) \wedge *\beta$$


where the sign depends on the product of n (the dimension of the manifold) and
p (the degree of the form).

Thus, the adjoint operator $\delta$ can be thought of as $\pm *d*$. The action of $\delta$
lowers the degree of a form by one unit, the opposite of what d does. If $\omega$
is a p-form, then $*\omega$ is an (n − p)-form, $d*\omega$ is an (n − p + 1)-form and $*d*\omega$ is
an (n − (n − p + 1)) = (p − 1)-form.


Example: In the particular case of three space dimensions (n = 3), we find
that the d-operator on 1-forms is the familiar curl:


$$d: \quad \omega = \omega_i\, dx^i \;\to\; d\omega = \frac{1}{2} (\partial_i \omega_j - \partial_j \omega_i)\, dx^i \wedge dx^j$$
We can now check that $\delta$ on 1-forms is the divergence:


$$\delta: \quad \omega = \omega_i\, dx^i \;\to\; \frac{1}{\sqrt{g}}\, \partial_i (\sqrt{g}\, \omega^i) \qquad (4.72)$$


Exercise: Derive the above result and show that the RHS is a genuine scalar
under coordinate transformations.


Clearly $\delta\omega$ defines the generalization of the divergence $\partial_i \omega^i$ to an arbitrary
Riemannian manifold in 3 dimensions.
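The claim that $\frac{1}{\sqrt{g}} \partial_i (\sqrt{g}\, \omega^i)$ is a genuine scalar can be tested in a simple case. The sketch below works, for brevity, in two dimensions rather than three, which is an assumption of this illustration. It takes the hypothetical Cartesian vector field $V = (x^2, 0)$, rewrites its components in polar coordinates where $\sqrt{g} = r$, and compares the result with the ordinary Cartesian divergence $2x$. Derivatives are central finite differences.

```python
import math

def field_polar(r, theta):
    """Polar components (V^r, V^theta) of the Cartesian field V = (x^2, 0)."""
    return (r * r * math.cos(theta) ** 3,
            -r * math.sin(theta) * math.cos(theta) ** 2)

def sqrt_g(r, theta):
    """sqrt(det g) for the flat metric ds^2 = dr^2 + r^2 dtheta^2."""
    return r

def divergence_polar(r, theta, h=1e-5):
    """(1/sqrt g) d_i (sqrt g V^i), with central differences for the derivatives."""
    def weighted(i, rr, tt):
        return sqrt_g(rr, tt) * field_polar(rr, tt)[i]
    d_r = (weighted(0, r + h, theta) - weighted(0, r - h, theta)) / (2 * h)
    d_theta = (weighted(1, r, theta + h) - weighted(1, r, theta - h)) / (2 * h)
    return (d_r + d_theta) / sqrt_g(r, theta)

def divergence_cartesian(x, y):
    """Ordinary divergence d_x (x^2) + d_y (0) = 2x of the same field."""
    return 2.0 * x
```

At any point away from the origin, `divergence_polar(r, theta)` agrees with `divergence_cartesian(r * cos(theta), r * sin(theta))` up to finite-difference error.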
Just as $d^2 = 0$, we also have $\delta^2 = 0$:


$$\delta\delta\omega = \pm (*d*)(*d*)\omega$$
$$= \pm (*d^2 *)\omega$$
$$= 0 \qquad (4.73)$$

A very important second-order differential operator in familiar, flat space
physics is the Laplacian $\Delta = \partial_i \partial_i$. The generalization to a manifold is defined
using the d and $\delta$ operators.


Definition: The Laplacian $\Delta$ is defined by
$$\Delta = (d + \delta)^2 = d\delta + \delta d$$
The equality of the two expressions is of course due to $d^2 = \delta^2 = 0$.
Clearly the Laplacian (unlike d and $\delta$) maps p-forms back to p-forms:
$$\Delta: \Lambda^p(M) \to \Lambda^p(M)$$


Moreover it is self-adjoint in the norm on p-forms. Thus, for example, it makes
sense to talk of eigenvectors and eigenvalues of the Laplacian on a manifold.
The study of these is called harmonic analysis.

In the special case of Euclidean space $\mathbb{R}^n$ and in Cartesian coordinates, the
Laplacian on any p-form gives:


$$\Delta\omega = (\partial_j \partial_j\, \omega_{i_1 \cdots i_p}(x))\, dx^{i_1} \wedge \cdots \wedge dx^{i_p}$$


Thus in this case it is indeed the usual Laplacian $\partial_i \partial_i$ on the individual
components. The Laplacian is a positive operator, which means its matrix element
between any form and itself is greater than or equal to zero:


$$(\omega, \Delta\omega) = (\omega, d\delta\omega + \delta d\omega)$$
$$= (\delta\omega, \delta\omega) + (d\omega, d\omega)$$
$$\ge 0. \qquad (4.74)$$


It also follows from this that $\Delta\omega = 0$ if and only if $d\omega = 0$ and $\delta\omega = 0$.


At this point we introduce some terminology and then state an important 
theorem. 


Definition: A form $\omega$ is said to be closed if $d\omega = 0$, and exact if $\omega = df$ for
some other form f. Similarly, $\omega$ is said to be co-closed if $\delta\omega = 0$ and co-exact
if $\omega = \delta f$ for some other form f. Finally, a form satisfying $\Delta\omega = 0$ is said to
be harmonic. Clearly a form is harmonic if and only if it is both closed and
co-closed.


Hodge decomposition theorem: If M is a compact manifold without boundary,
then any p-form $\omega$ can be uniquely decomposed as the sum of exact, co-exact,
and harmonic forms:

$$\omega = d\alpha + \delta\beta + \gamma$$


for some forms $\alpha, \beta, \gamma$, where $\Delta\gamma = 0$. Clearly if $\omega$ is a p-form, then $\alpha, \beta, \gamma$ are
(p − 1), (p + 1) and p forms respectively.

We omit the proof of this theorem here as it is somewhat involved. Never- 
theless the result will be a useful tool in studying de Rham cohomology in the 
following chapter. 


Chapter 5 


Homology and Cohomology 


5.1 Simplicial Homology 


We have seen in Chapter 2 that the topological properties of any topological space
can be understood to some extent via homotopy, the study of loops in the space. 
An alternative approach to studying topological properties, for a differentiable 
manifold, arises through the study of objects called simplices. This can be 
used to characterize topological properties of manifolds in terms of simplicial 
homology. A closely related methodology, though from a completely different 
starting point, is to characterise topology through the study of differential forms. 
This goes by the name of de Rham cohomology. The two approaches are in a 
certain sense dual to each other, as we will explain. 

We start by developing the theory of simplices and simplicial complexes. 
Intuitively, the idea is to define nice subsets of Euclidean space which look like 
polyhedra. These can be used to cover a manifold by a process called triangula- 
tion. Topological properties of the manifold can then be expressed in terms of 
the pieces which triangulate it. 

To develop some intuition about this process, a simple example is provided 
by taking the 2-sphere S? and drawing triangles all over it to completely cover 
it. Each of the triangles has one edge in common with some other triangle 
(Fig. 5.1) 

The triangles in the figure are triangles only in the sense that they are 
bounded by three lines. These are not “straight lines” in any sense, and the 
answers we extract will not depend on any local details of the lines or triangles. 
The important thing, as we will see in what follows, is that a generalised version 
of triangulation provides a powerful tool for the formulation and solution of 
problems regarding the topological properties of a manifold in any number of 
dimensions. 

For the general case, we first define an object called a simplex. This is
defined in Euclidean space $\mathbb{R}^n$ with no reference to the manifold of interest. If
$x_1, x_2, \cdots, x_{M+1}$ are distinct points in $\mathbb{R}^n$, they are said to be linearly
independent if the M vectors $x_2 - x_1, x_3 - x_1, \cdots, x_{M+1} - x_1$ are linearly independent
vectors. A simplex will be obtained by “filling in” the region in $\mathbb{R}^n$ defined by
a set of such points.


Figure 5.1: A triangulation of $S^2$.


Definition: An M-simplex $\sigma^M$ is the set of points

$$\sigma^M = \Big\{\, x \in \mathbb{R}^n \;\Big|\; x = \sum_{i=1}^{M+1} \lambda_i x_i, \quad \sum_{i=1}^{M+1} \lambda_i = 1, \quad \lambda_i \ge 0 \,\Big\} \qquad (5.1)$$
where $x_1, \cdots, x_{M+1}$ are linearly independent as defined above.

Here are some examples of simplices:
(i) A 1-simplex in $\mathbb{R}^n$ is the set


$$x = \lambda_1 x_1 + \lambda_2 x_2 \qquad (5.2)$$
with $\lambda_1 + \lambda_2 = 1$. Thus, it is the collection of points:
$$x = \lambda_1 x_1 + (1 - \lambda_1) x_2 \qquad (5.3)$$


as $\lambda_1$ varies from 0 to 1. Clearly, this is the straight line segment joining $x_1$ to $x_2$.
(ii) A 2-simplex in $\mathbb{R}^n$ is the set


$$x = \lambda_1 x_1 + \lambda_2 x_2 + \lambda_3 x_3 \qquad (5.4)$$
with $\lambda_1 + \lambda_2 + \lambda_3 = 1$. Thus,
$$x = \lambda_1 x_1 + \lambda_2 x_2 + (1 - \lambda_1 - \lambda_2) x_3 \qquad (5.5)$$


This is just the interior of the triangle, including the boundary. Linear
independence of the points $x_1, x_2, x_3$ ensures that they are not collinear.


In general, the points $\{x_1, \cdots, x_{M+1}\}$ are called the vertices of the
M-simplex $\sigma^M$, which itself is denoted


$$\sigma^M = [x_1, \cdots, x_{M+1}] \qquad (5.6)$$


Note that an M-simplex is not just the set of vertices but the whole region
“enclosed” by them. What we have defined are more accurately described as
“closed simplices”. It is evident that each M-simplex is homeomorphic to
$I^M = [0, 1] \times \cdots \times [0, 1]$ (M times), or in other words, to a closed subset of $\mathbb{R}^M$.


Figure 5.2: The shaded region (including the boundary) is an example of a
2-simplex.


The set of numbers $\lambda_i$ which label points in the simplex are called barycentric
coordinates. To understand this terminology, observe that if masses $\lambda_i$,
$i = 1, \cdots, M+1$, are placed at the points $x_i$, then the centre of
mass is located at $x = \sum_i \lambda_i x_i$, if $\sum_i \lambda_i = 1$.
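Barycentric coordinates also give a practical test for whether a point belongs to a simplex. The sketch below is a minimal illustration for a 2-simplex in the plane; restricting to $\mathbb{R}^2$ and the function names used here are assumptions of the example. It solves $p = \lambda_1 x_1 + \lambda_2 x_2 + \lambda_3 x_3$ with $\lambda_1 + \lambda_2 + \lambda_3 = 1$ by Cramer's rule.

```python
def barycentric(p, x1, x2, x3):
    """Barycentric coordinates (l1, l2, l3) of p with respect to [x1, x2, x3] in R^2.

    Solves p - x3 = l1 (x1 - x3) + l2 (x2 - x3) by Cramer's rule, then sets
    l3 = 1 - l1 - l2 so that the coordinates sum to 1.
    """
    det = (x1[0] - x3[0]) * (x2[1] - x3[1]) - (x2[0] - x3[0]) * (x1[1] - x3[1])
    if det == 0:
        raise ValueError("vertices are collinear, so they do not span a 2-simplex")
    l1 = ((p[0] - x3[0]) * (x2[1] - x3[1]) - (x2[0] - x3[0]) * (p[1] - x3[1])) / det
    l2 = ((x1[0] - x3[0]) * (p[1] - x3[1]) - (p[0] - x3[0]) * (x1[1] - x3[1])) / det
    return l1, l2, 1.0 - l1 - l2

def in_closed_simplex(p, x1, x2, x3):
    """True if all barycentric coordinates are >= 0 (closed simplex)."""
    return all(l >= 0 for l in barycentric(p, x1, x2, x3))

def in_open_simplex(p, x1, x2, x3):
    """True if all barycentric coordinates are > 0 (interior of the simplex)."""
    return all(l > 0 for l in barycentric(p, x1, x2, x3))
```

For the triangle with vertices (0, 0), (1, 0), (0, 1), the point (0.5, 0.5) lies on an edge: it belongs to the closed simplex, but one of its barycentric coordinates vanishes, so it is not an interior point.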

For each M-simplex, we can define the faces as the collection of (M − 1)-simplices
which lie “opposite” the vertices. For example, for a triangle, we can
associate to each vertex the line that lies opposite it. Thus we have:


Definition: The j-th face of an M-simplex $\sigma^M$ is the (M − 1)-simplex
$$x = \sum_{i=1,\, i \neq j}^{M+1} \lambda_i x_i \quad \text{with} \quad \sum_{i=1,\, i \neq j}^{M+1} \lambda_i = 1.$$
This amounts to considering the subset of the M-simplex
obtained by setting $\lambda_j = 0$. Since an M-simplex has M + 1 vertices, it evidently
also has M + 1 faces.


Definition: An open simplex is the interior of any closed simplex. We denote
this by $(\sigma)$. Given independent points $x_1, x_2, \cdots, x_{M+1}$, an open simplex is
$$(\sigma^M) = \Big\{\, x \in \mathbb{R}^n \;\Big|\; x = \sum_{i=1}^{M+1} \lambda_i x_i, \quad \sum_{i=1}^{M+1} \lambda_i = 1, \quad \lambda_i > 0 \,\Big\} \qquad (5.7)$$


We have simply modified the definition of closed simplex by restricting to $\lambda_i > 0$
instead of $\lambda_i \ge 0$. Clearly an open simplex is an open set in the corresponding
Euclidean space.


Finally, we define a simplicial complex as a collection of simplices with 
specific properties. 


Definition: A simplicial complex K is a finite collection of (closed) simplices in
some $\mathbb{R}^n$ satisfying:

(i) If $\sigma^p \in K$, then all faces of $\sigma^p$ belong to K.

(ii) If $\sigma^p, \sigma^q \in K$, then either $(\sigma^p) \cap (\sigma^q) = \emptyset$, or $\sigma^p = \sigma^q$.

(Recall that by $(\sigma^p)$ we mean the open simplex corresponding to the closed
simplex $\sigma^p$.) The dimension of a simplicial complex K is defined to be the
dimension of $\sigma^p \in K$ for the largest p.


Figure 5.3: A two-dimensional simplicial complex which is the union of three
0-simplices, three 1-simplices and one 2-simplex.


Basically the first part of the definition says that if a particular simplex 
belongs to a simplicial complex, then that complex must necessarily contain all 
the lower dimensional simplices in the original one. The second part says that 
any two distinct simplices in the complex can only touch along components of 
their faces, but cannot actually overlap each other. 

An example of a 2-dimensional simplicial complex can be defined by its 
collection of simplices as follows: 


$$K = \{\, [x_1, x_2, x_3],\ [x_1, x_2],\ [x_2, x_3],\ [x_1, x_3],\ [x_1],\ [x_2],\ [x_3] \,\} \qquad (5.8)$$


This complex is illustrated in Fig. 5.3. 

Some more possible two-dimensional simplicial complexes are illustrated in 
Fig. 5.4, while Fig.5.5 shows some objects that are not simplicial complexes. 
For example, in the first two diagrams of Fig.5.5 a 0-simplex is missing. (In 
these figures, heavy dots indicate 0-simplices.) 


Figure 5.4: Examples of simplicial complexes (check that the axioms hold!). 


Figure 5.5: These are not simplicial complexes (0-simplices and 1-simplices are
missing in some of these diagrams).

We still need a few more definitions before we can use simplicial complexes
to study the topology of differentiable manifolds.


Definition: The union of all members of a simplicial complex K with the 
Euclidean subspace topology is called the polyhedron associated with K. (The 
polyhedron will also be denoted K, whenever there is no chance of confusion.) 


Definition: A smooth triangulation of a differentiable manifold $M$ is a
homeomorphism $\phi : K \to M$ for some polyhedron $K$. ($\phi$ must satisfy a technical
property that we will not go into here.)


So far, this has been a rather abstract discussion with no reference to 
manifolds. The key result that relates all this to the topic of our interest is that 
every compact $C^\infty$ manifold can be smoothly triangulated. This is the basis of
the study of manifolds through simplicial complexes. We will not be able to 
provide a proof of this result here. 

We will need to refine the definition of a simplex to take into account the
concept of orientation: 


Definition: An oriented $p$-simplex ($p \ge 1$) is a $p$-simplex along with an
ordering for its vertices. The equivalence class of even permutations of the chosen
ordering defines a positively oriented simplex $+\sigma^p$, while odd permutations give
the negatively oriented simplex, denoted $-\sigma^p$. (Oriented simplices should be
denoted by a new symbol, such as $\langle\sigma^p\rangle$, but we avoid it to save notation and
because henceforth we will always work with oriented simplices.)


For example, a 2-simplex $\sigma^2 = [V_0, V_1, V_2]$ is associated to the positively
oriented simplex

$$+\sigma^2 = [V_0, V_1, V_2] = [V_1, V_2, V_0] = [V_2, V_0, V_1] \qquad (5.9)$$

and the negatively oriented simplex

$$-\sigma^2 = [V_0, V_2, V_1] = [V_2, V_1, V_0] = [V_1, V_0, V_2] \qquad (5.10)$$

Associated to every oriented $p$-simplex, we would like to define a set of
oriented $(p-1)$-simplices called the boundary of $\sigma^p$. As a set of simplices,
the boundary is just the collection of faces of $\sigma^p$. But we need to assign an
orientation to each face. This can be motivated as follows: if $[V_0, V_1]$ is an
ordered 1-simplex, then the boundary (denoted by $\partial$) must satisfy:

$$\partial[V_0, V_1] = -\partial[V_1, V_0] \qquad (5.11)$$

At the same time, the boundary will naturally contain just the two end points
of this one-dimensional simplex, namely $[V_0]$ and $[V_1]$.
Clearly, the only possibility satisfying these conditions is:

$$\partial[V_0, V_1] = [V_1] - [V_0] \qquad (5.12)$$

(or the opposite signs on the right, but that is an overall convention). The sum
of two oriented 0-simplices on the right hand side is a formal one, and we will
give it a precise meaning below.

Generalising this idea to one higher dimension, we see that the boundary 
of an oriented 2-simplex is 


$$\partial[V_0, V_1, V_2] = [V_1, V_2] - [V_0, V_2] + [V_0, V_1] \qquad (5.13)$$


The rule is evidently to eliminate one vertex at a time and to choose a sign 
corresponding to the location of the deleted vertex. Thus for a general p-simplex, 
we have: 


Definition: The boundary of an oriented $p$-simplex is

$$\partial[V_0, V_1, \ldots, V_p] = \sum_{j=0}^{p} (-1)^j\, [V_0, \ldots, \hat{V}_j, \ldots, V_p] \qquad (5.14)$$

where $\hat{V}_j$ means that the vertex $V_j$ is omitted in that term.
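The rule "delete one vertex at a time, with alternating sign" is easy to mechanise. The following Python sketch represents an oriented simplex as a tuple of vertex labels and returns its boundary as a list of (sign, face) pairs; the function name `simplex_boundary` and this representation are our own illustrative choices.

```python
def simplex_boundary(vertices):
    """Boundary of the oriented simplex [V_0, ..., V_p], following eq. (5.14).

    Returns a list of (sign, face) pairs, where the j-th face is obtained by
    omitting the j-th vertex and carries the sign (-1)**j.
    """
    vertices = tuple(vertices)
    if len(vertices) <= 1:
        return []  # a 0-simplex has no boundary
    return [((-1) ** j, vertices[:j] + vertices[j + 1:])
            for j in range(len(vertices))]

# Reproduces eq. (5.13): [V1,V2] - [V0,V2] + [V0,V1]
triangle_boundary = simplex_boundary(("V0", "V1", "V2"))
# Reproduces eq. (5.12): [V1] - [V0]
edge_boundary = simplex_boundary(("V0", "V1"))
```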


At this stage we need to define what the formal sums used above really 
mean. Actually, all we will do is to give a name to such sums of simplices, and 
enlarge our notion of simplices to include such formal sums. This will give a 
useful group structure to the study of simplicial complexes. 


Definition: Let $K$ be an $n$-dimensional simplicial complex, containing $l_p$
$p$-simplices. A $p$-chain of $K$ is a formal sum of oriented simplices with integer
coefficients. Thus, if $\sigma_i^p$ ($i = 1, \ldots, l_p$) are oriented $p$-simplices, then

$$C_p = \sum_{i=1}^{l_p} n_i \sigma_i^p, \qquad n_i \in \mathbb{Z} \qquad (5.15)$$

is a $p$-chain. So in fact the boundary of a $p$-simplex, as defined above, is a
$(p-1)$-chain. In terms of chains, a simplex can be thought of as the special
case of a chain where there is only one term in the sum.


The boundary operator on p-chains is easily defined in terms of the bound- 
ary operator on each simplex in the chain: 


$$\partial C_p = \partial\Big( \sum_{i=1}^{l_p} n_i \sigma_i^p \Big) = \sum_{i=1}^{l_p} n_i\, (\partial \sigma_i^p) \qquad (5.16)$$




The collection of p-chains forms an abelian group. The group axioms are estab- 
lished as follows: 


(i) $C_p = \sum_i n_i \sigma_i^p$, $D_p = \sum_i m_i \sigma_i^p$ $\Rightarrow$ $C_p + D_p = \sum_i (n_i + m_i)\, \sigma_i^p$.

(ii) The identity chain 0 is obtained by picking all $n_i = 0$.

(iii) If $C_p = \sum_i n_i \sigma_i^p$ is a chain then $-C_p = \sum_i (-n_i)\, \sigma_i^p$ is another chain, with
$C_p + (-C_p) = 0$.

(iv) Associativity obviously holds.

This is called the free Abelian group generated by the $p$-simplices of $K$,
denoted $C_p(K, \mathbb{Z})$ or simply $C_p(K)$. (The $\mathbb{Z}$ denotes that we only consider integer
multiples, which is part of the meaning of free Abelian group.)


Summarising the discussion above, we have defined a map $\partial$, called the
boundary map, on $p$-chains. (To be precise, we will label the map $\partial_p$ when it
acts on $p$-chains, though this may seem a bit pedantic. The reader may have
realised that even for the exterior derivative operator $d$, which we discussed
earlier, one could assign a label $p$ when it acts on $p$-forms. It is necessary to
stress this point only when confusion might otherwise arise.) This map:

$$\partial_p : C_p(K) \to C_{p-1}(K)$$

has the following properties:

(i) $\partial_p \big( \sum_i n_i \sigma_i^p \big) = \sum_i n_i\, (\partial_p \sigma_i^p)$

(ii) $\partial_p \sigma^p = \partial_p [V_0, V_1, \ldots, V_p] = \sum_{j=0}^{p} (-1)^j [V_0, \ldots, \hat{V}_j, \ldots, V_p]$

(iii) $\partial_0 [V_0] = 0$, since a 0-simplex cannot have a boundary.

(iv) $\partial_p$ is a homomorphism of the Abelian groups $C_p(K) \to C_{p-1}(K)$.

For the last point, one can check for example that $\partial_p(-C_p) = -\partial_p(C_p)$,
$\partial_p(C_p + D_p) = \partial_p C_p + \partial_p D_p$, and so on. However, it is important to note that this
homomorphism is not an isomorphism, because in general it is many-to-one.
For example, two different chains may have a vanishing boundary, in which case
they are both mapped to $0 \in C_{p-1}$. More generally, two or more chains can
have the same boundary.


As an example, the 1-chain $[V_0, V_1] + [V_1, V_2] + [V_2, V_0]$ satisfies

$$\partial\big([V_0, V_1] + [V_1, V_2] + [V_2, V_0]\big) = ([V_1] - [V_0]) + ([V_2] - [V_1]) + ([V_0] - [V_2]) = 0 \qquad (5.17)$$


Now we will prove an important theorem about the boundary operator $\partial$:
its square is zero, just as for the exterior derivative.


Theorem: $\partial^2 = 0$ (this really means $\partial_{p-1}\partial_p = 0$ in our pedantic notation).




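Proof: The proof is straightforward if slightly tedious to write down. It
basically relies on a kind of antisymmetry. We have:

$$
\begin{aligned}
\partial_{p-1}(\partial_p \sigma^p)
&= \partial_{p-1}\Big[ \sum_{j=0}^{p} (-1)^j\, [V_0, \ldots, \hat{V}_j, \ldots, V_p] \Big] \\
&= \sum_{j=0}^{p} (-1)^j\, \partial_{p-1} [V_0, \ldots, \hat{V}_j, \ldots, V_p] \\
&= \sum_{j=0}^{p} (-1)^j \Big( \sum_{i=0}^{j-1} (-1)^i\, [V_0, \ldots, \hat{V}_i, \ldots, \hat{V}_j, \ldots, V_p]
 + \sum_{i=j+1}^{p} (-1)^{i-1}\, [V_0, \ldots, \hat{V}_j, \ldots, \hat{V}_i, \ldots, V_p] \Big) \\
&= \sum_{i<j} (-1)^{i+j}\, [V_0, \ldots, \hat{V}_i, \ldots, \hat{V}_j, \ldots, V_p]
 + \sum_{i>j} (-1)^{i+j-1}\, [V_0, \ldots, \hat{V}_j, \ldots, \hat{V}_i, \ldots, V_p] \\
&= \sum_{i<j} \big( (-1)^{i+j} + (-1)^{i+j-1} \big)\, [V_0, \ldots, \hat{V}_i, \ldots, \hat{V}_j, \ldots, V_p] \\
&= 0 \qquad (5.18)
\end{aligned}
$$

In the second sum of the fourth line we have interchanged the labels $i$ and $j$ to
reach the fifth line. The last equality follows since $(-1)^{i+j} + (-1)^{i+j-1} = 0$.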
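The cancellation in the proof can also be verified mechanically on small examples. In the sketch below, a chain is stored as a dictionary mapping each simplex (a tuple of vertex labels in a fixed increasing order) to its integer coefficient; this representation and the function name `boundary` are our own assumptions, chosen so that deleting a vertex never requires re-sorting.

```python
from collections import defaultdict

def boundary(chain):
    """Boundary of a chain {simplex: coefficient}, extended linearly from
    the single-simplex rule: delete vertex j with sign (-1)**j."""
    result = defaultdict(int)
    for simplex, coeff in chain.items():
        if len(simplex) == 1:
            continue  # 0-simplices have no boundary
        for j in range(len(simplex)):
            face = simplex[:j] + simplex[j + 1:]
            result[face] += (-1) ** j * coeff
    # drop terms whose coefficients cancelled
    return {s: c for s, c in result.items() if c != 0}

tetrahedron = {(0, 1, 2, 3): 1}           # a single 3-simplex
once = boundary(tetrahedron)              # four oriented triangles
twice = boundary(once)                    # every edge appears twice with opposite signs
```

Applying `boundary` twice to any chain built this way returns the empty dictionary, which is the zero chain.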


Using these properties of the boundary operator, we can identify two useful 
kinds of chains. 


(i) If $\partial C_p = 0$ then the chain $C_p$ is called a cycle.
(ii) If $C_p = \partial C_{p+1}$ for some chain $C_{p+1}$, then $C_p$ is called a boundary.


Clearly, if $C_p = \partial C_{p+1}$ then $\partial C_p = 0$. Thus every boundary is a cycle.
However, the reverse is not true in general, as we will see. Indeed, if the boundary of
a chain is zero, that does not necessarily imply that this chain is a boundary of 
some other one. This property turns out to capture a lot of information about 
the manifold which will be triangulated using simplicial complexes. 

We have already noted that the $p$-chains of $K$ form an Abelian group $C_p(K)$.
It is easy to see that cycles and boundaries form Abelian subgroups of this group.
Define

$$
\begin{aligned}
Z_p(K) &= \{\, C_p \in C_p(K) \mid \partial C_p = 0 \,\} \\
B_p(K) &= \{\, C_p \in C_p(K) \mid C_p = \partial C_{p+1} \,\} \qquad (5.19)
\end{aligned}
$$

Thus $Z_p(K)$ and $B_p(K)$ are, respectively, the groups formed by the cycles and
the boundaries of $K$, and clearly, $B_p(K) \subset Z_p(K) \subset C_p(K)$ (not just as sets
but also as groups).




Now we are in a position to define the concept of a homology group. This 
is again an Abelian group, in which the elements are the cycles up to possible 
addition of boundaries. 


Definition: The $p$-th homology group $H_p(K)$ of the simplicial complex $K$ with
integer coefficients, is the quotient of the group of cycles by the group of
boundaries:

$$H_p(K) = Z_p(K)/B_p(K)$$

Thus, the elements of $H_p(K)$ are the equivalence classes of cycles which differ
by boundaries. It is obvious that $H_p(K)$ is itself an Abelian group. Evidently, for
an $n$-dimensional simplicial complex $K$, we have $H_p(K) = 0$ for $p > n$ since
there are no chains of dimension larger than that of the complex, by definition.
Thus there is a finite number of homology groups for each manifold, one for
each integer $p$ between 0 and $n$.


Let us make this definition still more explicit. Suppose $h \in H_p(K)$. Then
$h$ is an equivalence class of $p$-cycles: $h = [C_p^1, C_p^2, C_p^3, \ldots]$. Any pair of cycles
$C_p^i, C_p^j$ in this class have the property that their difference is a boundary:
$C_p^i - C_p^j = \partial C_{p+1}$ for some chain $C_{p+1}$. A different element $h' \in H_p(K)$ will represent
a distinct equivalence class: $h' = [D_p^1, D_p^2, D_p^3, \ldots]$. If $h'$ is not the same class as
$h$, it means that there is no $C_{p+1}$ such that $C_p^i - D_p^j = \partial C_{p+1}$.

We can equivalently define $H_p(K)$ in terms of a sequence of group homomorphisms.

Definition: If $f : G \to H$ is a homomorphism from the group $G$ to the group
$H$, then:

(i) The kernel of $f$, denoted $\ker(f)$, is the subgroup of elements of $G$ which get
mapped to the identity in $H$:

$$\ker(f) = \{\, x \in G \mid f(x) = 0 \in H \,\} \qquad (5.20)$$

(ii) The image of $f$, denoted $\mathrm{im}(f)$, is the subgroup of elements of $H$ which
come from some element of $G$ under the map:

$$\mathrm{im}(f) = \{\, y \in H \mid y = f(x),\ x \in G \,\} \qquad (5.21)$$


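Now clearly, the group $Z_p(K)$ of cycles is the same as $\ker(\partial_p)$, while the
group $B_p(K)$ of boundaries is the same as $\mathrm{im}(\partial_{p+1})$. In fact, the boundary
operator $\partial$ provides a sequence of homomorphisms

$$\cdots \xrightarrow{\ \partial_{p+1}\ } C_p(K) \xrightarrow{\ \partial_p\ } C_{p-1}(K) \xrightarrow{\ \partial_{p-1}\ } \cdots \qquad (5.22)$$

This is called a chain complex, which means that the image of the map at
each stage is contained in the kernel of the subsequent map. (The sequence is
exact at a given stage only when the image is equal to the kernel there.) This is obviously
a consequence of $\partial_{p-1}\partial_p = 0$ for each $p$. In this language, the homology groups
$H_p(K)$ defined earlier are just the quotients of the kernel by the image:

$$H_p(K) = \frac{\ker(\partial_p)}{\mathrm{im}(\partial_{p+1})} \qquad (5.23)$$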




So far, the homology group appears to be a property associated to a given 
simplicial complex (equivalently, polyhedron) $K$, which specifies a triangulation
of the manifold. But the following result tells us that it is independent of the 
triangulation. 


Theorem: If two polyhedra $K$, $L$ are homeomorphic as topological spaces, then
$H_p(K) = H_p(L)$. (We omit the proof.)


Since every compact manifold $M$ can be smoothly triangulated by some
polyhedron $K$, we can define $H_p(M)$, the homology group of the manifold, to
be the homology group $H_p(K)$ for the polyhedron triangulating it. This is an
important way to characterize the topology of a manifold.


Example: Let us find $H_0$ and $H_1$ for the circle, $S^1$. The circle can be
triangulated as in Fig. 5.6. Thus,

$$K = \{\, [V_0, V_1],\ [V_1, V_2],\ [V_2, V_0],\ [V_0],\ [V_1],\ [V_2] \,\} \qquad (5.24)$$

Figure 5.6: Triangulation of $S^1$.


Now let us look for the 1-cycles which constitute $Z_1(K)$. The most general
1-chain is:

$$C_1 = a[V_0, V_1] + b[V_1, V_2] + c[V_2, V_0] \qquad (5.25)$$

where $a, b, c$ are integers. This chain will be a cycle if $\partial C_1 = 0$. This leads to:

$$a[V_1] - a[V_0] + b[V_2] - b[V_1] + c[V_0] - c[V_2] = 0 \qquad (5.26)$$

It follows that $a = b = c$. So $Z_1(K)$ is the group of elements:

$$Z_1(K) = \{\, a\,([V_0, V_1] + [V_1, V_2] + [V_2, V_0]) \mid a \in \mathbb{Z} \,\} \qquad (5.27)$$


Next, we look for 1-boundaries. By definition, these must bound 2-chains,
but there are no 2-chains in our complex, since it is 1-dimensional. It follows
that $B_1(K) = 0$. Thus,

$$
\begin{aligned}
H_1(K) = Z_1(K)/B_1(K) &= Z_1(K) \\
&= \{\, a\,([V_0, V_1] + [V_1, V_2] + [V_2, V_0]) \mid a \in \mathbb{Z} \,\} \qquad (5.28)
\end{aligned}
$$

which is isomorphic to the group $\mathbb{Z}$ of integers. Thus we can write:

$$H_1(S^1) = \mathbb{Z} \qquad (5.29)$$




For $H_0(K)$, we start by finding $Z_0(K)$. This is defined by

$$\partial\,(a[V_0] + b[V_1] + c[V_2]) = 0 \qquad (5.30)$$

But this is always true since $\partial[V_i] = 0$ by definition. (A 0-chain cannot have a
boundary.) So

$$Z_0(K) = \{\, a[V_0] + b[V_1] + c[V_2] \mid a, b, c \in \mathbb{Z} \,\} \qquad (5.31)$$

which is isomorphic to $\mathbb{Z} \oplus \mathbb{Z} \oplus \mathbb{Z}$.
Next we find the 0-boundaries:

$$B_0(K) = \{\, C_0 \in C_0(K) \mid C_0 = \partial C_1 \,\} \qquad (5.32)$$

Take a general 1-chain, as above, then its boundary is

$$\partial\,(a[V_0, V_1] + b[V_1, V_2] + c[V_2, V_0]) = (a - b)[V_1] + (b - c)[V_2] + (c - a)[V_0] \qquad (5.33)$$

Thus, an element of $B_0(K)$ is specified by three integers $(a - b)$, $(b - c)$, $(c - a)$.
But these are not independent (they sum to zero!). However any two of them
are independent, so

$$B_0(K) = \mathbb{Z} \oplus \mathbb{Z} \qquad (5.34)$$

The quotient is obtained as follows: a general element of $Z_0(K)$ is:

$$
\begin{aligned}
a[V_0] + b[V_1] + c[V_2] &= (a + b + c)[V_0] + \big( (-b - c)[V_0] + b[V_1] + c[V_2] \big) \\
&= (a + b + c)[V_0] \pmod{B_0(K)} \qquad (5.35)
\end{aligned}
$$

since the second term, $\big( (-b - c)[V_0] + b[V_1] + c[V_2] \big)$, is an element of $B_0(K)$ as
we have shown above. So $H_0(K)$ is labelled by a single integer $(a + b + c)$, and
hence

$$H_0(S^1) = \mathbb{Z} \qquad (5.36)$$

This completes the calculation of the simplicial homology groups for $S^1$.
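The counting in this example can be reproduced with elementary linear algebra. The sketch below writes the boundary map $\partial_1$ of the triangulated circle as a $3 \times 3$ integer matrix (rows labelled by $V_0, V_1, V_2$, columns by the three edges, an ordering we choose for illustration) and computes ranks over the rationals. Note the limitation: ranks only count the number of independent $\mathbb{Z}$ factors and would miss any finite cyclic factors, which happen to be absent here.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of an integer matrix over the rationals (Gaussian elimination)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

# Columns: boundaries of [V0,V1], [V1,V2], [V2,V0]; rows: coefficients of V0, V1, V2.
d1 = [[-1,  0,  1],
      [ 1, -1,  0],
      [ 0,  1, -1]]
n_vertices, n_edges = 3, 3

rank_d1 = rank(d1)
rank_Z1 = n_edges - rank_d1      # independent 1-cycles (kernel of d1)
rank_B1 = 0                      # no 2-simplices, so no 1-boundaries
rank_B0 = rank_d1                # independent 0-boundaries (image of d1)
rank_H1 = rank_Z1 - rank_B1
rank_H0 = n_vertices - rank_B0   # every 0-chain is a cycle
```

The ranks agree with the text: $B_0(K)$ has two independent generators, and both $H_0$ and $H_1$ have a single $\mathbb{Z}$ factor.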


The answer that we found for $H_0(S^1)$ is actually a special case of a general
result:


Theorem: If $K$ is a connected polyhedron then $H_0(K) = \mathbb{Z}$. (The proof is
straightforward.)


The homology groups defined above were built up using formal sums of
simplices with integer coefficients, so they should actually be denoted $H_p(M, \mathbb{Z})$.
Instead of integer coefficients, we could have considered formal sums of
simplices with real coefficients, which define homology groups which are denoted
$H_p(M, \mathbb{R})$. These groups are related, but not in a one-to-one fashion, to
$H_p(M, \mathbb{Z})$.




For this, first note that for any manifold, $H_p(M, \mathbb{Z})$ must be a direct sum
of Abelian groups of integers, which means it has the general form

$$H_p(M, \mathbb{Z}) = \mathbb{Z} \oplus \cdots \oplus \mathbb{Z} \oplus \mathbb{Z}_{p_1} \oplus \mathbb{Z}_{p_2} \oplus \cdots \qquad (5.37)$$

where $\mathbb{Z}$ is the group of integers under addition, while $\mathbb{Z}_p$ is the finite group of
integers under addition modulo $p$, known as the cyclic group of order $p$.

The finite groups $\mathbb{Z}_{p_i}$ appearing in $H_p(M, \mathbb{Z})$ are known as torsion
subgroups of the homology group. If we consider now the real, rather than integer,
homology, the result turns out to be:

$$H_p(M, \mathbb{R}) = \mathbb{R} \oplus \cdots \oplus \mathbb{R} \qquad (5.38)$$

where the number of copies of $\mathbb{R}$ in $H_p(M, \mathbb{R})$ is the same as the number of copies
of $\mathbb{Z}$ in $H_p(M, \mathbb{Z})$. In other words, in the real homology, the torsion subgroups
are ignored, and only the $\mathbb{Z}$ factors in the integer homology contribute.


Definition: The number of $\mathbb{Z}$ factors in $H_p(M, \mathbb{Z})$, or equivalently the
dimension of $H_p(M, \mathbb{R})$, is called the $p$-th Betti number of the manifold $M$, which we
denote $\beta_p$.


The Betti numbers are topological invariants of a manifold, which means 
that any two manifolds that are homeomorphic to each other will necessarily 
have the same Betti numbers. However the converse is not true: having the 
same Betti numbers does not tell us that the two manifolds are homeomorphic. 

The sum of all the Betti numbers with alternating signs is called the Euler
characteristic $\chi(M)$ of the manifold:

$$\chi(M) = \sum_{p=0}^{\dim M} (-1)^p \beta_p \qquad (5.39)$$


The Euler characteristic, like the Betti numbers in terms of which we de- 
fined it, captures something of the topology of a manifold. A very interesting 
theorem tells us that the same topological information can be captured merely 
by counting the simplices in a triangulation of the manifold. 


Theorem: For an $n$-dimensional simplicial complex $K$, let $\alpha_p$ denote the number
of $p$-simplices of $K$. Then

$$\chi(K) = \sum_{p=0}^{n} (-1)^p \alpha_p \qquad (5.40)$$


Thus one can deduce the Euler characteristic of a manifold merely by
triangulating it, without having to actually compute the homology groups! It is not
the case in general that $\alpha_p = \beta_p$ for each $p$, so that counting simplices does not
determine the individual Betti numbers. Indeed, the number of simplices in a
triangulation is not a topological invariant of the manifold, since there are many
different ways to triangulate a manifold with various numbers of simplices. The
theorem above tells us that however we triangulate the manifold, the alternating
sum of the number of simplices will come out the same and will be a topological
invariant equal to the Euler characteristic.


Example: Consider the 2-sphere, $S^2$. This is homeomorphic to the surface of a tetrahedron,
for which we easily find

$$\alpha_0 = 4, \qquad \alpha_1 = 6, \qquad \alpha_2 = 4 \qquad (5.41)$$

from which it follows that

$$\chi(S^2) = 4 - 6 + 4 = 2 \qquad (5.42)$$
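The simplex counts for the tetrahedron can be generated rather than read off a picture. The sketch below is an illustration under our own conventions: a complex is specified by its top-dimensional simplices (tuples of vertex labels), and all lower-dimensional faces are filled in automatically, as required by axiom (i) of a simplicial complex. The function names are hypothetical.

```python
from itertools import combinations

def simplex_counts(top_simplices):
    """Return [alpha_0, alpha_1, ...] for the complex consisting of the
    given simplices together with all of their faces."""
    simplices = set()
    for s in top_simplices:
        for k in range(1, len(s) + 1):
            simplices.update(combinations(sorted(s), k))
    dim = max(len(s) for s in simplices) - 1
    return [sum(1 for s in simplices if len(s) == p + 1) for p in range(dim + 1)]

def euler_characteristic(counts):
    """Alternating sum of the simplex counts, as in eq. (5.40)."""
    return sum((-1) ** p * a for p, a in enumerate(counts))

# Surface of a tetrahedron: the four triangles on the vertices 0, 1, 2, 3.
tetra_surface = list(combinations(range(4), 3))
counts = simplex_counts(tetra_surface)
chi = euler_characteristic(counts)
```

For the tetrahedron surface this gives the counts 4, 6, 4 and $\chi = 2$. A single filled triangle, by contrast, gives 3, 3, 1 and $\chi = 1$.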


This example is a special case of a famous relation, valid for the surface of
any convex polyhedron in 3-dimensional space. The surface of any such polyhedron $P$ is homeomorphic
to $S^2$, so $\chi(P) = 2$. Now let $V$ be the number of vertices, $E$ the number of
edges and $F$ the number of faces of the polyhedron. In principle these numbers
cannot yet be related to $\alpha_i$, $i = 0, 1, 2$ since the $\alpha$'s refer to the simplices in a
triangulation, while a polyhedron may have faces which are not triangles.
Nevertheless, any polygonal face of a polyhedron can be subdivided into triangles,
and it is easy to check that the number $V - E + F$ for the polyhedron is equal
to $\alpha_0 - \alpha_1 + \alpha_2$ as obtained after subdividing all faces into triangles. It follows
that for any such polyhedron in 3 dimensions,


V-E+F=2 (5.43) 


which is a classic result in mathematics known as the Euler relation. 


Figure 5.7: “Triangulation” of a 2-torus (by squares).


Example: Another convenient case to study is the two-dimensional torus
$S^1 \otimes S^1 = T^2$. This can be easily “triangulated” by squares, as shown in Fig. 5.7.
From the discussion above, it is possible to treat the squares as 2-simplices even
though the latter should really be triangles. Studying the figure (including the
part of it that is not visible!), we find:


$$
\begin{aligned}
\alpha_0 &= 32 \\
\alpha_1 &= 24 + 24 + 12 + 4 = 64 \\
\alpha_2 &= 8 + 8 + 12 + 4 = 32 \qquad (5.44)
\end{aligned}
$$

and hence

$$\chi(T^2) = 32 - 64 + 32 = 0 \qquad (5.45)$$


So we learn that the 2-torus has vanishing Euler characteristic. 
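The vanishing of $\chi(T^2)$ does not depend on the particular grid of squares. As a sketch, the code below builds an $m \times n$ grid of squares with opposite sides identified (which is one way of drawing a torus) and counts the distinct vertices, edges and faces. The grid construction and the function name are our own assumptions, not a description of Fig. 5.7; a $4 \times 8$ grid happens to reproduce the totals 32, 64, 32 quoted above.

```python
def torus_grid_counts(m, n):
    """Count vertices, edges and square faces of an m x n grid on the torus
    (indices are taken modulo m and n to identify opposite sides)."""
    if m < 3 or n < 3:
        raise ValueError("need m, n >= 3 so that distinct cells have distinct vertex sets")
    vertices, edges, faces = set(), set(), set()
    for i in range(m):
        for j in range(n):
            here = (i, j)
            right = ((i + 1) % m, j)
            up = (i, (j + 1) % n)
            diag = ((i + 1) % m, (j + 1) % n)
            vertices.add(here)
            edges.add(frozenset((here, right)))
            edges.add(frozenset((here, up)))
            faces.add(frozenset((here, right, up, diag)))
    return len(vertices), len(edges), len(faces)

v, e, f = torus_grid_counts(4, 8)
chi_torus = v - e + f
```

For any admissible grid size the counts are $mn$, $2mn$ and $mn$, so the alternating sum vanishes identically.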


Exercise: Consider a general 2-dimensional manifold, which has the form of 
a surface with many “handles” (the sphere and torus have 0 and 1 handle 
respectively). By breaking it up into tori and triangulating each torus, show 
that a surface $\Sigma_g$ with $g$ handles has Euler characteristic

$$\chi(\Sigma_g) = 2 - 2g \qquad (5.46)$$


5.2 De Rham Cohomology 


We now introduce a method to characterise the topology of a manifold in terms 
of the properties of differential forms. Although individual differential forms 
depend on the local properties of a manifold, it turns out that one can define 
a structure very analogous to the homology discussed above, on the space of 
forms, and this reproduces the same topological invariants (the Betti numbers) 
that we obtained through simplicial homology. 

Recall that the exterior algebra $\bigoplus_{p=0}^{n} \Lambda^p(M)$ is the space of all $p$-forms on
$M$, $0 \le p \le n$ (where $n$ is the dimension of the manifold). Now each $p$-form
is mapped onto some $(p+1)$-form (possibly zero) by the exterior derivative $d$,
which satisfies $d^2 = 0$. Thus we have a sequence of maps:

$$\cdots \xrightarrow{\ d_{p-2}\ } \Lambda^{p-1}(M) \xrightarrow{\ d_{p-1}\ } \Lambda^{p}(M) \xrightarrow{\ d_{p}\ } \Lambda^{p+1}(M) \xrightarrow{\ d_{p+1}\ } \cdots \qquad (5.47)$$

where $d_p$ is the $d$ operator on $p$-forms, viewed as a group homomorphism between
the vector spaces $\Lambda^p$ and $\Lambda^{p+1}$.


In perfect analogy with the definitions of cycles and boundaries, we now 
define two special kinds of forms: 


(i) If $d\omega_p = 0$ then the form $\omega_p$ is said to be closed.
(ii) If $\omega_p = d\omega_{p-1}$ for some form $\omega_{p-1}$, then $\omega_p$ is said to be exact.


Moreover, the $p$-forms on $M$ form an Abelian group, indeed a real vector space
$\Lambda^p(M)$. Closed and exact forms span subspaces of this vector space, or in other
words they are Abelian subgroups of this Abelian group. We define

$$
\begin{aligned}
Z^p(M) &= \{\, \omega_p \in \Lambda^p(M) \mid d\omega_p = 0 \,\} \\
B^p(M) &= \{\, \omega_p \in \Lambda^p(M) \mid \omega_p = d\omega_{p-1} \,\} \qquad (5.48)
\end{aligned}
$$




Thus $Z^p(M)$ and $B^p(M)$ are, respectively, the groups formed by the closed and
the exact forms on $M$. All exact forms are closed, by virtue of $d^2 = 0$, therefore
$B^p(M) \subset Z^p(M) \subset \Lambda^p(M)$ (as vector spaces, hence as groups).

A close analogy is apparent, in structure and in notation, between these 
concepts for the differential forms and the corresponding notions for simplicial 
complexes. Some important differences should be noted. While for simplices 
we started by taking linear combinations with integer coefficients, for forms 
we are forced at the outset to allow linear combinations with real coefficients, 
since forms are intrinsically real-valued. Another point is that simplices were 
themselves an auxiliary construction allowing us to define topological invariants 
for a manifold, while in the case of forms we work directly on the manifold itself. 
An important point about notation is that for differential forms, the label p on 
the groups of closed and exact forms is always written as a superscript, by 
convention, whereas for p-chains it is a subscript. 

After all this, the following definition should come as no surprise. 


Definition: The $p$-th de Rham cohomology group $H^p(M)$ of the manifold $M$
is the quotient of the group of closed forms by the group of exact forms:

$$H^p(M) = Z^p(M)/B^p(M) \qquad (5.49)$$

Thus, the elements of $H^p(M)$ are the equivalence classes of closed forms
which differ by exact forms. Again we clearly have $H^p(M) = 0$ for $p > n$ (where
$n$ is the dimension of the manifold) since by antisymmetry there are no forms
of degree larger than the dimension of the manifold. So there is a finite number of de
Rham cohomology groups for each manifold, one for each integer $p$ between 0
and $n$.

As we noted above, forms are intrinsically defined with real (rather than 
integer) coefficients, so this is a real cohomology, and each group has the form: 


$$H^p(M) = \mathbb{R} \oplus \cdots \oplus \mathbb{R} \qquad (5.50)$$


Suppose now that the manifold $M$ has been smoothly triangulated. Thus
there is a homeomorphism between the whole of $M$ and a polyhedron. Pick
an open $p$-simplex $\sigma^p$ in the polyhedron. This will be homeomorphic to an
open submanifold of $M$, so we can integrate a $p$-form $\omega_p$ over this set (using
techniques described in the previous chapter) and call it the integral of $\omega_p$ over
the simplex $\sigma^p$. Thus we have defined $\int_{\sigma^p} \omega_p$.

This is easily generalized to the integration of forms over chains. Given a
$p$-chain $C_p = \sum_i n_i \sigma_i^p$, we define

$$\int_{C_p} \omega_p = \sum_i n_i \int_{\sigma_i^p} \omega_p \qquad (5.51)$$

An obvious generalization of Stokes' theorem tells us that

$$\int_{C} d\omega = \int_{\partial C} \omega \qquad (5.52)$$




where $\omega$ is a $p$-form and $C$ is a $(p+1)$-chain.
In this way we have defined a pairing of forms $\omega$ and chains $C$:

$$(\omega, C) = \int_C \omega \qquad (5.53)$$

where $\omega \in \Lambda^p(M)$ and $C \in C_p(K)$. Stokes' theorem says that

$$(d\omega, C) = (\omega, \partial C). \qquad (5.54)$$


Thus we have obtained the fundamental result that the boundary operator 0 
and the exterior derivative d are dual to each other. We can use this pairing to 
associate the de Rham cohomology classes and the simplicial homology classes
(with real coefficients) to each other. 

Let $[\omega]$ be an element of $H^p(M)$ (the square bracket is used to denote
an equivalence class). The class $[\omega]$ contains some particular closed form $\omega$
satisfying $d\omega = 0$, along with all other forms which differ from $\omega$ by an exact
form $df$, where $f$ is an arbitrary $(p-1)$-form. Thus

$$[\omega] = \{\, \omega,\ \omega + df_1,\ \omega + df_2,\ \ldots \,\} \qquad (5.55)$$

where $f_i$ are all the elements of $\Lambda^{p-1}(M)$. This is what a particular de Rham
cohomology class looks like.


Next, let $[C]$ be an element of $H_p(M)$, namely, a homology class. Thus
$[C]$ contains a given cycle $C$ with $\partial C = 0$, and all others which differ by the
addition of boundaries $\partial\phi$, where $\phi$ is any $(p+1)$-chain. Thus,

$$[C] = \{\, C,\ C + \partial\phi_1,\ C + \partial\phi_2,\ \ldots \,\} \qquad (5.56)$$

where $\phi_i$ are all the elements of $C_{p+1}(M)$.

Now the question is whether the pairing $(\omega, C) = \int_C \omega$ of forms and chains
induces a pairing $([\omega], [C])$ of de Rham cohomology classes and simplicial
homology classes. In other words, if we define

$$([\omega], [C]) \equiv (\omega, C), \qquad (5.57)$$

where $\omega$ is any element of $[\omega]$, and $C$ any element of $[C]$, is this independent
of the representatives $\omega \in [\omega]$, $C \in [C]$ that we chose? This will be true if and
only if

$$(\omega + df,\ C + \partial\phi) = (\omega, C) \qquad (5.58)$$

where $f$ is a $(p-1)$-form and $\phi$ is a $(p+1)$-chain, both arbitrary.
The desired result is true by linearity and Stokes' theorem:

$$
\begin{aligned}
(\omega + df,\ C + \partial\phi) &= (\omega, C) + (\omega, \partial\phi) + (df, C) + (df, \partial\phi) \\
&= (\omega, C) + (d\omega, \phi) + (f, \partial C) + (d^2 f, \phi) \\
&= (\omega, C) \qquad (5.59)
\end{aligned}
$$

since $d\omega = 0$, $\partial C = 0$ and $d^2 f = 0$.
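A concrete instance of this independence can be checked numerically. As an illustration of our own choosing (not taken from the text), consider the punctured plane $\mathbb{R}^2 - \{0\}$ with the closed 1-form $\omega = (-y\,dx + x\,dy)/(x^2 + y^2)$, which is not exact there. The sketch below integrates it along closed polygonal 1-cycles by the midpoint rule; adding the exact form $df$ with $f = xy$, or replacing the cycle by another one that differs from it by the boundary of a region avoiding the origin, should leave the pairing unchanged at $2\pi$.

```python
import math

def integrate_1form(form, path, steps=2000):
    """Midpoint-rule integral of the 1-form P dx + Q dy along a polygonal
    path; form(x, y) must return the pair (P, Q)."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        dx, dy = (x1 - x0) / steps, (y1 - y0) / steps
        for k in range(steps):
            t = (k + 0.5) / steps
            x, y = x0 + t * (x1 - x0), y0 + t * (y1 - y0)
            P, Q = form(x, y)
            total += P * dx + Q * dy
    return total

def omega(x, y):
    """Closed but not exact on the punctured plane (the 'angle' form)."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def omega_plus_df(x, y):
    """omega + df with f = x*y, so df = y dx + x dy."""
    P, Q = omega(x, y)
    return (P + y, Q + x)

# Two counterclockwise cycles around the origin; they differ by a boundary.
triangle = [(1, 0), (-1, 1), (-1, -1), (1, 0)]
square = [(2, 2), (-2, 2), (-2, -2), (2, -2), (2, 2)]

pairing = integrate_1form(omega, triangle)
pairing_shifted_form = integrate_1form(omega_plus_df, triangle)
pairing_shifted_cycle = integrate_1form(omega, square)
```

All three numbers agree with $2\pi$ to within the accuracy of the quadrature, in line with eq. (5.58).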

Thus we have exhibited the duality between de Rham cohomology and
(real) simplicial homology for smoothly triangulated manifolds. This is the
content of de Rham's theorem. (This actually requires many technical points to
be demonstrated in addition to the above simple calculation, but we omit these
here.) Thus,

$$H_p(M, \mathbb{R}) \simeq H^p(M, \mathbb{R}). \qquad (5.60)$$


where we recall that the left hand side, with p as a subscript, denotes the pth 
simplicial homology group, while the right hand side, with p as a superscript, 
denotes the pth de Rham cohomology group. 

It may seem a bit surprising that the de Rham groups, defined in terms of 
intrinsically local $C^\infty$ forms, capture only global information about the topology
of a manifold. In fact the following lemma tells us this must be so. 


Poincaré's lemma: Every closed form is locally exact. In other words, given
$\omega$ such that $d\omega = 0$, on any finite patch of the manifold we can find $f$ such that
$\omega = df$.


Therefore the construction of closed forms modulo exact forms contains no lo- 
cal information. Whatever information it does carry must therefore be global, 
concerning the topology of the space. 


Example: If the whole manifold is homeomorphic to an open disc in $\mathbb{R}^n$,
then it follows that all closed forms are exact, and $H^p(M) = 0$ for $p > 0$. In
electrodynamics (which we usually do on $\mathbb{R}^3$), $dF = 0$ implies $F = dA$, so there
always exists a gauge potential. If there is a pointlike magnetic monopole, then
fields are singular at its location so we must work in the modified space $\mathbb{R}^3 - \{0\}$
where the location of the monopole has been removed. Now one can still have
$dF = 0$, but for $F$ corresponding to a monopole field there is no globally defined
$A$ for which $F = dA$. Thus one ends up either with a Dirac string singularity
(by which the space is changed from $\mathbb{R}^3 - \{0\}$ to $\mathbb{R}^3 - \{\text{infinite half-line}\}$, and
then again $F = dA$), or else we must allow for multi-valued gauge potentials.
Several physical effects, such as the Bohm-Aharonov effect, follow from similar
considerations.


5.3 Harmonic Forms and de Rham Cohomology


We have seen that in de Rham cohomology, an element of $H^p(M)$ is an
equivalence class of closed forms which differ by exact forms. We may use the Hodge
decomposition theorem as a convenience to uniquely specify a particular
element of each class. Recall that according to this theorem (valid on compact
manifolds),

$$\omega_p = d\alpha + \delta\beta + \gamma \qquad (5.61)$$

where $\Delta\gamma = 0$, and this decomposition is unique. Now suppose $\omega$ is closed, so
$d\omega = 0$. Then

$$d(d\alpha + \delta\beta + \gamma) = 0 \qquad (5.62)$$

Since $d^2\alpha = 0$, and also $d\gamma = 0$ (every harmonic form is necessarily closed), we
have

$$d(\delta\beta) = 0 \qquad (5.63)$$

This implies $(\beta, d\delta\beta) = (\delta\beta, \delta\beta) = 0$, which in turn implies $\delta\beta = 0$. So for a
closed form $\omega$,

$$\omega = d\alpha + \gamma \qquad (5.64)$$

with $\Delta\gamma = 0$. Thus any closed form is equal to a harmonic form plus an exact
form.

This gives a convenient way to label any element of $H^p(M)$. Among all
the elements in a cohomology class, only one will be harmonic. Then the set of
classes which form the cohomology group $H^p$ is equivalent to the set of distinct
harmonic $p$-forms on $M$, denoted $\mathrm{Harm}^p(M)$. So we can say that

$$H^p(M, \mathbb{R}) \simeq \mathrm{Harm}^p(M, \mathbb{R}) \qquad (5.65)$$

This result is particularly useful for the following reason. The harmonic $p$-forms
are just the $p$-forms which solve Laplace's equation,

$$\Delta\omega = 0, \qquad (5.66)$$

so they span the kernel of the differential operator $\Delta$. Now $\Delta$ is self-adjoint and
positive semi-definite, and it is a theorem that such an operator on a compact
manifold has a finite-dimensional kernel. So there are finitely many independent solutions to
the above equation. This shows that $\mathrm{Harm}^p(M)$ and hence $H^p(M)$ are
finite-dimensional on a compact manifold.
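There is a finite-dimensional analogue of this statement that can be tested directly, although it is not the construction used in the text. On a simplicial complex one may form the combinatorial Laplacian $\partial_p^{T}\partial_p + \partial_{p+1}\partial_{p+1}^{T}$ acting on real $p$-chains; it is a standard fact that the dimension of its kernel equals the $p$-th Betti number. The sketch below, with helper names of our own, does this for the three-edge triangulation of the circle from Section 5.1, where there are no 2-simplices so only one term survives in each Laplacian.

```python
from fractions import Fraction

def rank(matrix):
    """Rank over the rationals by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Boundary matrix of the circle: columns [V0,V1], [V1,V2], [V2,V0]; rows V0, V1, V2.
d1 = [[-1, 0, 1], [1, -1, 0], [0, 1, -1]]

laplacian_1 = matmul(transpose(d1), d1)   # acts on 1-chains (edges)
laplacian_0 = matmul(d1, transpose(d1))   # acts on 0-chains (vertices)

harmonic_dim_1 = 3 - rank(laplacian_1)
harmonic_dim_0 = 3 - rank(laplacian_0)
```

Each kernel is one-dimensional, matching $\beta_0 = \beta_1 = 1$ for the circle. The harmonic 1-chain is proportional to $(1, 1, 1)$, that is, to the cycle $[V_0, V_1] + [V_1, V_2] + [V_2, V_0]$ found earlier.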


Chapter 6 


Fibre Bundles 


6.1 The Concept of a Fibre Bundle 


A fibre bundle is a kind of topological space with additional structure on it. Its 
main property is that locally, but not always globally, it looks like the product of 
two distinct spaces, one of which is to be thought of as a “fibre” above the other, 
which is the “base”. This concept is of great importance in physics, where the 
“base” is usually the spacetime manifold, while the fibre represents dynamical 
variables. Some subtle and important physical phenomena can be conveniently 
represented in the language of fibre bundles, where they appear more natural 
and can be associated with topological properties of the bundle. 

Let us first discuss the concept of product space and product manifold. The
product of two sets $S_1$, $S_2$ is the set

$$S_1 \otimes S_2 = \{\, (x_1, x_2) \mid x_1 \in S_1,\ x_2 \in S_2 \,\} \qquad (6.1)$$

If $S_1$ and $S_2$ are topological spaces, then let $U_1 = \{u_i\}$, $U_2 = \{v_j\}$ be their
respective topologies. We may define a topology, called the product topology, on
$S_1 \otimes S_2$, generated by the products of open sets:

$$U_{S_1 \otimes S_2} = \{ u_i \times v_j \}, \qquad u_i \in U_1,\ v_j \in U_2 \qquad (6.2)$$

It is easy to check that these sets (together with their unions) define a topology on the set $S_1 \otimes S_2$. For example,
the usual topology generated by open intervals on $\mathbb{R}$ leads to a product topology
on $\mathbb{R}^2 = \mathbb{R} \otimes \mathbb{R}$ generated by the open rectangles:

$$\{\, (x, y) \in \mathbb{R}^2 \mid a < x < b,\ c < y < d \,\} \qquad (6.3)$$

This is equivalent to the usual topology of open disks on $\mathbb{R}^2$.

Figure 6.1: The product topology on $\mathbb{R}^2$.

If the spaces $S_1$ and $S_2$ are also differentiable manifolds, then so is their
product. To prove this, we construct an atlas as follows. $S_1$ has an atlas
$(U_\alpha, \phi_\alpha)$, while $S_2$ has an atlas $(V_\beta, \psi_\beta)$. Then the product manifold $S_1 \otimes S_2$
has an atlas $(U_\alpha \otimes V_\beta,\ \phi_\alpha \otimes \psi_\beta)$ where $U_\alpha \otimes V_\beta$ are the usual products of sets,
while $\phi_\alpha \otimes \psi_\beta$ are the homeomorphisms:

$$\phi_\alpha \otimes \psi_\beta : U_\alpha \otimes V_\beta \to \phi_\alpha(U_\alpha) \otimes \psi_\beta(V_\beta) \qquad (6.4)$$

Clearly, if $S_1$ and $S_2$ have dimensions $m_1$, $m_2$, then $S_1 \otimes S_2$ has dimension
$m_1 + m_2$.

An example of a product space is the cylinder, $S^1 \otimes [0,1]$. Since $[0,1]$ is a
manifold with boundary, so is $S^1 \otimes [0,1]$. This space is illustrated in Fig. 6.2.

Figure 6.2: The cylinder, $S^1 \otimes [0,1]$. The shaded region is $U \times [0,1]$ where $U$ is
an open interval in $S^1$.


Consider now the Möbius strip (Fig. 6.3), a well-known object that one can
construct by folding a strip around and gluing the two ends after giving one of
them a twist. It is definitely not a product manifold. Yet, if we take some open
set $U$ in $S^1$, then the shaded region of the Möbius strip does resemble the direct
product space $U \otimes [0,1]$. Thus, we may say that the Möbius strip locally looks
like a direct product manifold.

Figure 6.3: The Möbius strip, which locally looks like a product space.

Let us consider another simple example of a space which locally looks like a
direct product. Take the infinite helix, or spiral, and compare it with the direct
product space $S^1 \otimes \mathbb{Z}$.

Figure 6.4: The infinite helix and the space $S^1 \otimes \mathbb{Z}$.


We can define the helix, illustrated in the first picture in Fig. 6.4, as a subset
of 3-dimensional space. It is parametrised by $(\cos 2\pi t, \sin 2\pi t, t)$, $t \in \mathbb{R}$. On the
other hand, the space $S^1 \otimes \mathbb{Z}$, illustrated in the second picture in Fig. 6.4, can
be parametrised as $(\cos 2\pi t, \sin 2\pi t, n)$, $t \in \mathbb{R}$, $n \in \mathbb{Z}$, and is quite different: it
is an infinite collection of circles, labelled by integers.

Consider now the open sets marked on both diagrams. Clearly they are
homeomorphic to each other, and each is homeomorphic to the product space
(open interval) $\otimes\, \mathbb{Z}$. But the whole helix is not homeomorphic to $S^1 \otimes \mathbb{Z}$! In
fact, it is a connected space while $S^1 \otimes \mathbb{Z}$ is not. We will see below that the two
spaces in Fig. 6.4 are distinct fibre bundles, but with the same base space and
fibre.

Let us restate the situation above somewhat differently. First of all, the
circle $S^1$ can be thought of as the quotient space $\mathbb{R}/\mathbb{Z}$ (the space obtained by
identifying all real numbers $t$ that differ by an integer $n$). Thus the second space
in the figure is homeomorphic to $(\mathbb{R}/\mathbb{Z}) \otimes \mathbb{Z}$. On the other hand, the first one
(the helix) is simply homeomorphic to $\mathbb{R}$, and this is certainly not the same as
$(\mathbb{R}/\mathbb{Z}) \otimes \mathbb{Z}$, despite the fact that they have homeomorphic subsets. This is why
we say the two spaces are locally, but not globally, equivalent.


Definition: A fibre bundle is a space E which is locally like the product of two 
spaces X and F and possesses the following properties: 

(i) E is a topological space, called the total space. 

(ii) $X$ is a topological space, called the base space, and there is a surjective,
continuous map $\pi : E \to X$ called the projection map.

(iii) $F$ is a topological space, called the fibre.

(iv) There is a group $G$ of homeomorphisms of $F$ onto itself, called the structure
group of the fibre bundle.

(v) There is a collection $\phi$ of homeomorphisms:

$$\phi_\alpha : \pi^{-1}(U_\alpha) \to U_\alpha \otimes F \tag{6.5}$$

such that if $(x, f)$ is a point of $U_\alpha \otimes F$ then

$$\pi\, \phi_\alpha^{-1}(x, f) = x \in X \tag{6.6}$$

($U_\alpha$ is any open set of the base $X$).

We now explain all these properties in some detail, and then illustrate
them through examples. Let us first discuss the projection map. This simply
associates to each point on the fibre bundle a corresponding point on the base
space $X$. This tells us that every point in the bundle $E$ “lies above” a unique
point in the base $X$. $\pi$ is generally a many-to-one mapping, and the inverse
image $\pi^{-1}(x)$ for a given $x \in X$ is the set of all points in $E$ above the point $x$.

Next, the homeomorphisms $\phi_\alpha$ give a specific way of regarding the points
in $E$ “above” the open set $U_\alpha \subset X$ as being points in the product space $U_\alpha \otimes F$.
This is achieved by homeomorphically mapping $\pi^{-1}(U_\alpha)$ to $U_\alpha \otimes F$.

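The role of the group $G$ appears when we consider points in $X$ which lie in
the intersection of two open sets $U_\alpha$ and $U_\beta$. We have the homeomorphisms:

$$\begin{aligned}
\phi_\alpha &: \pi^{-1}(U_\alpha \cap U_\beta) \to (U_\alpha \cap U_\beta) \otimes F \\
\phi_\beta &: \pi^{-1}(U_\alpha \cap U_\beta) \to (U_\alpha \cap U_\beta) \otimes F
\end{aligned} \tag{6.7}$$

Combining these, we find homeomorphisms:

$$\phi_\alpha \cdot \phi_\beta^{-1} : (U_\alpha \cap U_\beta) \otimes F \to (U_\alpha \cap U_\beta) \otimes F \tag{6.8}$$

Now for each fixed $x \in U_\alpha \cap U_\beta$, these can be regarded as homeomorphisms,

$$g_{\alpha\beta} = \phi_\alpha \cdot \phi_\beta^{-1} : F \to F \tag{6.9}$$

We require that these homeomorphisms form a group of transformations of the
fibre $F$; this is called the structure group $G$.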

If the topological spaces are actually differentiable manifolds, then we get 
(with a few assumptions about differentiability) a differentiable fibre bundle. 




Let us see how this works in the example of the helix defined earlier. As a
topological space, the total space $E$ is the real line $\mathbb{R}$. Choose the base space $X$
to be the circle $S^1$, and the fibre $F$ to be the set of integers $\mathbb{Z}$. The projection
map just takes each point on the helix to the corresponding point on the circle:

$$\pi : (\cos 2\pi t, \sin 2\pi t, t) \to (\cos 2\pi t, \sin 2\pi t) \in S^1. \tag{6.10}$$

This is a surjective, continuous map $E \to X$, and $(\cos 2\pi t, \sin 2\pi t)$ parametrises
points in $S^1$, with $t$ being a parameter that ranges from 0 to 1 along the circle.

Given an open interval $U_\alpha = \{t \mid a < t < b\}$ of $S^1$, $\pi^{-1}(U_\alpha)$ is the set of
all points $(\cos 2\pi t, \sin 2\pi t, t + n)$, where $t \in U_\alpha$, $n \in \mathbb{Z}$. (Note that $t$ now has
a restricted range, unlike the parameter $t$ introduced earlier while defining the
helix. This is because $t$ now represents a point in an open set of the circle $S^1$.)
Define the homeomorphism:

$$\phi_\alpha : \pi^{-1}(U_\alpha) \to U_\alpha \otimes \mathbb{Z} \tag{6.11}$$

by

$$\phi_\alpha\big((\cos 2\pi t, \sin 2\pi t, t + n)\big) = (\cos 2\pi t, \sin 2\pi t, n) \tag{6.12}$$


This explicitly shows that the inverse image, under the projection map, of an 
open set in the base space, is homeomorphic to the direct product of the open 
set with the fibre. 
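As a concrete illustration, the projection and a local trivialisation of the helix can be sketched in a few lines of Python. This is only a sketch under our own conventions: the names `helix_point`, `projection`, `trivialise` and `untrivialise` are hypothetical, the patch is assumed to have length at most 1, and the point of the circle is recorded by its parameter $t$ rather than by $(\cos 2\pi t, \sin 2\pi t)$.

```python
import math

def helix_point(s):
    """Point of the helix at height s (s is any real number)."""
    return (math.cos(2 * math.pi * s), math.sin(2 * math.pi * s), s)

def projection(p):
    """pi: forget the height and keep the point of the circle below."""
    x, y, _ = p
    return (x, y)

def trivialise(p, a, b):
    """phi_alpha over the patch U_alpha = {t | a < t < b}, with b - a <= 1.

    Splits the height as t + n with a < t < b and n an integer,
    and returns the pair (t, n) in U_alpha x Z.
    """
    s = p[2]
    n = math.floor(s - a)
    t = s - n
    if not (a < t < b):
        raise ValueError("point does not lie above U_alpha")
    return (t, n)

def untrivialise(t, n):
    """Inverse of phi_alpha: rebuild the helix point from (t, n)."""
    return helix_point(t + n)
```

For example, the helix point at height 5.3 lies above the patch $0.25 < t < 0.75$, and `trivialise` assigns it $t \approx 0.3$ and the integer label $n = 5$.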


Figure 6.5: Two overlapping open sets $U_1$, $U_2$ on the circle $S^1$.


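To complete this example, consider two overlapping open sets of $S^1$ defined
by:

$$\begin{aligned}
U_1 &= \{\, t \mid \tfrac{1}{4} - \epsilon < t < \tfrac{3}{4} + \epsilon \,\} \\
U_2 &= \{\, t \mid \tfrac{3}{4} - \epsilon < t < \tfrac{5}{4} + \epsilon \,\}
\end{aligned} \tag{6.13}$$

Then $\phi_1$ and $\phi_2$, restricted to the points above $U_1 \cap U_2$, are given on the upper and lower overlapping
regions as follows.

Lower overlap: $\tfrac{3}{4} - \epsilon < t < \tfrac{3}{4} + \epsilon$.

$$\begin{aligned}
\phi_1(\cos 2\pi t, \sin 2\pi t, t + n) &= (\cos 2\pi t, \sin 2\pi t, n) \\
\phi_2(\cos 2\pi t, \sin 2\pi t, t + n) &= (\cos 2\pi t, \sin 2\pi t, n)
\end{aligned} \tag{6.14}$$

Clearly the homeomorphism $\phi_2 \cdot \phi_1^{-1}$ is just the identity on this overlap.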

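The upper overlap is more subtle, since with respect to the patch $U_1$ the
region is $\tfrac{1}{4} - \epsilon < t < \tfrac{1}{4} + \epsilon$, while with respect to $U_2$ it is $\tfrac{5}{4} - \epsilon < t < \tfrac{5}{4} + \epsilon$. It
is convenient to make the two ranges coincide, in which case $n$ gets replaced by
$n - 1$ in the image of $\phi_2$. Then,

Upper overlap: $\tfrac{1}{4} - \epsilon < t < \tfrac{1}{4} + \epsilon$:

$$\begin{aligned}
\phi_1(\cos 2\pi t, \sin 2\pi t, t + n) &= (\cos 2\pi t, \sin 2\pi t, n) \\
\phi_2(\cos 2\pi t, \sin 2\pi t, t + n) &= (\cos 2\pi t, \sin 2\pi t, n - 1)
\end{aligned} \tag{6.15}$$

Thus

$$\phi_2 \cdot \phi_1^{-1} : (U_1 \cap U_2) \otimes \mathbb{Z} \to (U_1 \cap U_2) \otimes \mathbb{Z} \tag{6.16}$$

is given by

$$\phi_2 \cdot \phi_1^{-1}\big((\cos 2\pi t, \sin 2\pi t, n)\big) = (\cos 2\pi t, \sin 2\pi t, n - 1) \tag{6.17}$$

For fixed $t$, this can be thought of as the map

$$\phi_2 \cdot \phi_1^{-1} : \mathbb{Z} \to \mathbb{Z} \tag{6.18}$$

defined by

$$\phi_2 \cdot \phi_1^{-1}(n) = n - 1 \tag{6.19}$$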


This map, under composition with itself and its inverse, generates the group 
G = Z of all integers under addition. Thus the structure group of the bundle 
in this example is Z. 
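The two transition functions can be checked numerically. The sketch below assumes the patches are $\tfrac14 - \epsilon < t < \tfrac34 + \epsilon$ and $\tfrac34 - \epsilon < t < \tfrac54 + \epsilon$ (our reading of Eq. (6.13)) with the arbitrary choice $\epsilon = 0.05$. The names `fibre_label` and `transition` are ours and purely illustrative.

```python
import math

EPS = 0.05                        # illustrative value of epsilon
U1 = (0.25 - EPS, 0.75 + EPS)     # patch U_1 as an interval of t
U2 = (0.75 - EPS, 1.25 + EPS)     # patch U_2 as an interval of t

def fibre_label(height, patch):
    """Integer n with height = t + n and t inside the open interval `patch`.

    Each patch is shorter than 1, so n is unique whenever it exists.
    """
    a, b = patch
    n = math.floor(height - a)
    t = height - n
    if not (a < t < b):
        raise ValueError("point does not lie above this patch")
    return n

def transition(height):
    """Labels (n1, n2) that phi_1 and phi_2 assign to the same helix point."""
    return fibre_label(height, U1), fibre_label(height, U2)

# Lower overlap (t near 3/4): both patches agree, so the map is the identity.
lower = transition(0.75 + 3)
# Upper overlap (t near 1/4): phi_2 sees the same point with n lowered by one.
upper = transition(0.25 + 3)
```

With these choices `lower` is `(3, 3)` and `upper` is `(3, 2)`, which reproduces the identity on the lower overlap and the shift $n \to n - 1$ on the upper one.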

In this way we have identified all the ingredients in our definition of fibre 
bundle, in the simple example of the helix. The one special property of this 
example is that the structure group G and the fibre F are the same, namely, Z. 
We will see later that this example corresponds to a principal bundle. 

We close this section with a rather obvious definition: 


Definition: A fibre bundle with total space $E$, base space $X$ and fibre $F$ is
called a trivial bundle if $E = X \otimes F$.


In this case the homeomorphisms $\phi_\alpha \cdot \phi_\beta^{-1}$ are all equal to the identity and
the structure group is trivial.




6.2 Tangent and Cotangent Bundles 


In a previous chapter we have defined the tangent space to a manifold $M$ at
a point $p$, denoted $T_p(M)$. This space has the structure of an $n$-dimensional
vector space, where $n$ is the dimension of the manifold. Let us now use it
to define a particular fibre bundle over the base space $X = M$. The fibre is
$F = T_p(M) \simeq \mathbb{R}^n$, and the total space is

$$E = T(M) = \bigcup_{p \in M} T_p(M) \tag{6.20}$$

(It is always true that the total space of a bundle is the union of the fibres above
each point.) This is called the tangent bundle of the manifold $M$.

The projection $\pi : E \to X$ is easy to visualise in this case. It is the map
$\pi : T(M) \to M$ defined by $\pi(V \in T_p(M)) = p$. So it associates the tangent
space at a point $p$ of the manifold with the point $p$ itself.

Next we must specify a homeomorphism:

$$\phi_\alpha : \pi^{-1}(U_\alpha) \to U_\alpha \otimes \mathbb{R}^n \tag{6.21}$$

where $U_\alpha$ is an open set of $M$. This is provided by the local coordinates on $U_\alpha$.
If $p \in U_\alpha$, and its coordinates are $x^i(p) \in \mathbb{R}^n$, then a tangent vector $V_p \in T_p(M)$
is an element of $\pi^{-1}(U_\alpha)$, and it can be represented as

$$V_p = a^i(x(p))\, \frac{\partial}{\partial x^i} \tag{6.22}$$

This defines a map $V_p \to (p, a^i(x(p)))$ which is the desired homeomorphism
from $\pi^{-1}(U_\alpha)$ to $U_\alpha \otimes \mathbb{R}^n$.
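Now consider two different coordinate systems $x^i$ and $y^i$ in patches $U_\alpha$ and
$U_\beta$ respectively. In the overlap $U_\alpha \cap U_\beta$ of the two patches, we have the two
homeomorphisms

$$\begin{aligned}
\phi_\alpha &: V_p \to (p, a^i(p)), \quad \text{in coordinates } x^i \\
\phi_\beta &: V_p \to (p, b^i(p)), \quad \text{in coordinates } y^i
\end{aligned} \tag{6.23}$$

and $\phi_\beta \cdot \phi_\alpha^{-1}$ relates $a^i(p)$ and $b^i(p)$ by

$$\phi_\beta \cdot \phi_\alpha^{-1} : a^i(p) \to b^i(p) = \left. \frac{\partial y^i}{\partial x^j} \right|_p a^j(p) \tag{6.24}$$

Thus the structure group $G$ consists of all real nonsingular $n \times n$ matrices, in
other words $G = GL(n, \mathbb{R})$.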
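A small worked case may help. On the plane with the origin removed, take Cartesian coordinates $(x, y)$ as the first chart and polar coordinates $(r, \theta)$ as the second. The Python sketch below computes the Jacobian matrix of the change of chart and uses it to transform vector components. The example and the function names are our own choice, not taken from the text.

```python
import math

def jacobian_polar_from_cartesian(x, y):
    """Matrix d(r, theta)/d(x, y) at a point (x, y) other than the origin."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],        # dr/dx,     dr/dy
            [-y / r2, x / r2]]     # dtheta/dx, dtheta/dy

def transform_components(jac, a):
    """b^i = (dy^i/dx^j) a^j: the same tangent vector in the second chart."""
    return [sum(jac[i][j] * a[j] for j in range(2)) for i in range(2)]

def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]
```

At the point $(3, 4)$ the determinant is $1/r = 0.2$, which is nonzero, so the transition matrix is indeed an element of $GL(2, \mathbb{R})$. The vector $\partial/\partial x$ there has polar components $(0.6, -0.16)$.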

One can similarly construct the cotangent bundle $\bigcup_p T_p^*(M)$, and the tensor
bundles

$$\bigcup_p \Big[ T_p(M) \otimes T_p(M) \otimes \cdots \otimes T_p^*(M) \otimes T_p^*(M) \otimes \cdots \otimes T_p^*(M) \Big] \tag{6.25}$$

All these have structure group $G = GL(n, \mathbb{R})$, although the fibres are different
in each case.




6.3 Vector Bundles and Principal Bundles 


In many of the above examples, the fibre F is a vector space. Such fibre bundles 
are special and have a name: 


Definition: A vector bundle is a fibre bundle whose fibre F is a vector space. 


As one more example related to manifolds, consider the orthonormal frames
at a point. The bundle obtained by taking the orthonormal frames $\{e_a\}$ at each
point of $M$ is called the orthonormal (tangent) frame bundle. This is not a vector
bundle! To specify a point in the fibre, namely a frame, one has to specify an
$O(n)$ rotation relative to some given frame. Thus the fibre is isomorphic to the
group $O(n)$; it is the group manifold of $O(n)$. The structure group of this bundle
is also clearly $O(n)$, since across patches one has to specify an $O(n)$ rotation to
match the fibres. So in this case, the fibre is the same as the structure group.
This type of bundle is also special and has a name:


Definition: A principal bundle is a fibre bundle whose fibre F is the structure 
group G itself. 

The example of a helix over S! that we encountered earlier is another principal 
bundle, this time with fibre and structure group isomorphic to Z, the group of 
integers under addition. 


The last few examples have involved base spaces and fibres which are ac- 
tually differentiable manifolds. They are therefore known as differentiable fibre 
bundles. 


Definition: A differentiable fibre bundle is a fibre bundle $(E, X, F, \pi, \phi, G)$ for
which:

(i) The base space $X$ is an $n$-dimensional differentiable manifold.

(ii) The fibre $F$ is an $m$-dimensional differentiable manifold.

(iii) The total space $E$ is an $(m+n)$-dimensional differentiable manifold.

(iv) The projection $\pi : E \to X$ is a $C^\infty$ map, of rank $n$ everywhere.

(v) $\phi$ is a collection of diffeomorphisms (rather than just homeomorphisms).

(vi) $G$ is a Lie group which acts differentiably and effectively. (Thus the map
$g : F \to F$ with $g \in G$ is a $C^\infty$ map.)

For a $C^\infty$ manifold, the tangent, cotangent and orthonormal frame bundles are
all differentiable fibre bundles.


For a trivial fibre bundle (a direct product of the base space with the fibre), 
one could define a function on the base space taking values in the fibre. This 
corresponds to choosing a specific point in the fibre above each point of the base 
space. For non-trivial bundles we can still do this locally, but not necessarily 
globally. 


Definition: A local section (or local cross-section) of a fibre bundle is a continuous
injective map $\sigma : U \subset X \to E$ such that $\pi \cdot \sigma(x) = x$.


To define a local section is to continuously pick a point in $E$ “above” every point
$x$ in a local region of the base space. The fact that the image is “above” $x$ is
imposed by the requirement that if we send the image back down to $X$ by the
projection map $\pi$, then it lands on the original point $x$. Thus a local section is
precisely a function on an open set of the base space, with values in the fibre.


Definition: A global section is a continuous injective map $\sigma : X \to E$ such
that $\pi \cdot \sigma(x) = x$.

This differs from a local section in being defined all over the base space, and 
not just on an open set of it. 


Local sections always exist, but global sections may not. It is easy to
convince oneself that on the helix there is no global section, but on the Möbius
strip there is. The latter is illustrated in Fig. 6.6. After identifying the sides of
the rectangle appropriately with the arrows aligned, the wavy line in the figure
is a global section of the Möbius strip viewed as a bundle over $S^1$.


Figure 6.6: A global section of the Möbius strip.


A global section of the tangent bundle is a $C^\infty$ vector field all
over the manifold, while a global section of the cotangent bundle is a $C^\infty$ 1-form
all over the manifold. It is easy to see that such sections always exist. Indeed,
since the structure group $G = GL(n, \mathbb{R})$ acts as a linear homogeneous
transformation, the zero vector field is invariant, and one can define it everywhere.
Similarly the 1-form $\omega = 0$ is a global section of the cotangent bundle.

Whether there exist continuous everywhere non-zero sections of the tangent
bundle is another story. This depends on the base manifold. For the two-sphere
$S^2$, there is a well-known theorem sometimes stated as follows: “You can’t comb
the hair on a billiard ball.” If a 2-sphere were represented as a ball and vector
fields as “hair” attached at each point of the ball, then “combing” the hair would
amount to flattening this set of vectors of nonzero length to point tangentially
to the surface. This would be an everywhere nonzero vector field. In fact it is
easy to convince oneself that this is impossible. There is always some point on
the ball where the “hair” has no possible direction consistent with continuity.
In other words, $S^2$ admits no continuous, everywhere non-zero vector field.
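The following Python sketch does not prove the theorem, but it shows the typical behaviour on one natural attempt: the vector field generated by rotations about the $z$-axis is tangent to the unit sphere everywhere, yet it vanishes at the two poles. The field and the helper names are our own illustrative choices.

```python
import math

def sphere_point(theta, phi):
    """Point of the unit 2-sphere with polar angle theta and azimuth phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def rotation_field(p):
    """Velocity field of a rotation about the z-axis, evaluated at p."""
    x, y, _ = p
    return (-y, x, 0.0)

def dot(u, v):
    """Euclidean inner product of two 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    """Euclidean length of a 3-vector."""
    return math.sqrt(dot(u, u))
```

The field is orthogonal to the position vector at every point (so it is tangent to the sphere), and its length is $\sin\theta$, which is zero exactly at the north and south poles.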


Exercise: Try to find a more precise proof that there is no continuous
nowhere-vanishing vector field on $S^2$.


One would like to have some criteria to decide whether a given bundle is
trivial or not, since it is really in the latter case that they are most interesting.
We will quote a few theorems without proof.


Theorem: Given a fibre bundle with base space $X$ and structure group $G$, if
either $X$ or $G$ is contractible then the bundle is trivial. (The converse is not
true! A bundle may be trivial even though $X$ and $G$ are non-contractible.)


This theorem has a corollary which is useful even when the spaces are not 
completely contractible. This says that we can replace G by any subgroup of 
the same homotopy type, and similarly for X, without changing the topological 
character of the bundle. 

Note that a topological space with the discrete topology is not contractible. 
To be contractible, a space should have the property that the identity map, 
which sends each element to itself, and the constant map, which sends each 
element to a fixed element, are homotopic. This is never true with the discrete 
topology. 

We have not yet arrived at any criterion that can tell us whether a given 
fibre bundle is trivial or not. The following definition will help us find such a 
criterion. 


Definition: Given any bundle E with fibre F and structure group G, we can 
construct the associated principal bundle, denoted P(E), by simply replacing 
the fibre F with the structure group G. 


This is a useful construction because of the following result: 


Theorem: The bundles P(E) and E are both trivial if and only if P(E) admits 
a global section. 


Let us use these theorems to study some fibre bundles. 


Examples: 


(i) The Möbius strip. Here the structure group $G$ is $\mathbb{Z}_2$, whose nontrivial element
acts as a twisting on one end of the strip before it is glued to the other end.
The associated principal bundle therefore has a fibre that consists of precisely
two points. Indeed, this bundle is just the boundary of the Möbius strip.
Clearly the principal bundle has no global section, so the Möbius strip is a
non-trivial bundle. It should be recalled that the Möbius strip itself certainly
does admit global sections, but that tells us nothing about its non-triviality
since it is not a principal bundle. However, by the theorem above, once we have
shown that the associated principal bundle is nontrivial then we can be sure the
same result also holds for the Möbius strip.
Figure 6.7: The principal bundle associated to the Möbius strip.

(ii) $T(S^2)$, the tangent bundle of $S^2$. To show its nontriviality, we will use the
fact, referred to above, that there is no everywhere non-zero vector field on $S^2$.
The associated principal bundle to $T(M)$ is the bundle whose fibres are
$GL(n, \mathbb{R})$. This is called the frame bundle (note that it is not orthonormal!),
and is denoted $F(M)$. Now an element of $GL(n, \mathbb{R})$ can be specified by a
collection of $n$ linearly independent, non-zero real $n$-vectors (these just make up
the columns of the matrix). The existence of a global section of the principal
bundle associated to $T(M)$ thus reduces to the existence of a collection of $n$
independent vector fields which are everywhere non-zero. But no such vector
field exists on $S^2$, so $F(S^2)$ and hence also $T(S^2)$ are non-trivial bundles.

One can contract $GL(n, \mathbb{R})$ to $O(n)$, from which we get the additional result
(using the corollary to a theorem above) that the orthonormal frame bundle is
non-trivial on $S^2$.


Definition: A manifold $M$ for which the orthonormal frame bundle has a global
section is said to be parallelisable.


Thus, the tangent bundle $T(M)$ to a manifold $M$ is a trivial bundle if and only
if $M$ is parallelisable.


Example: $S^1$ is parallelisable. In fact we only need one non-zero continuous
tangent vector field, which is depicted in Fig. 6.8. Hence, $T(S^1)$ is trivial, and
is in fact the infinite cylinder $S^1 \otimes \mathbb{R}$.

Figure 6.8: A non-zero continuous vector field on $S^1$.


We conclude with a result which is certainly not easy to prove, but is quite
remarkable.


Theorem: The manifolds of all the compact Lie groups are parallelisable, and so
is the 7-sphere $S^7$, which is not a Lie group manifold. Among the spheres $S^n$,
the only parallelisable ones are $S^1$, $S^3$ and $S^7$. The
first two are Lie group manifolds, corresponding to the groups $U(1)$ and $SU(2)$
respectively.


Bibliography 


{1] I.M. Singer and J.A. Thorpe: ‘Lecture Notes on Elementary Topology and 
Geometry’, Springer (1976). 


[2] T. Eguchi, P.B. Gilkey and A.J. Hanson, ‘Gravitation, Gauge Theories and 
Differential Geometry’, Phys. Rep 66, 213 (1980). 


[3] C. Nash and S. Sen, ‘Topology and Geometry for Physicists’, Academic 
Press (1988). 


[4] M. Nakahara, ‘Geometry, Topology and Physics’, Taylor and Francis (2nd 
edition, 2003). 


Part II: Group Theory and Structure 
and Representations of Compact 
Simple Lie Groups and Algebras 


N. Mukunda 


Indian Institute of Science (retired) 
Bangalore 560 012, India 


+ Jawaharlal Nehru Fellow 


Introduction to Part II 


The aim of these notes is to provide the reader with a concise account of ba- 
sic aspects of groups and group representations, in the form needed for most 
applications to physical problems. While the emphasis is not so much on com- 
pleteness of coverage and mathematical rigour as on gaining familiarity with 
useful techniques, it is hoped that these notes would enable the reader to go 
further on his or her own in any specialised aspects. The main interest will be 
in Lie groups and Lie algebras; compact simple Lie algebras; their classification 
and representation theory; and some specialised topics such as the use of Dynkin 
diagrams, spinors, etc. 

These notes are intentionally written in an informal style, with problems 
and exercises occasionally inserted to help the reader grasp the points being 
made. 


Chapter 7 


Review of Groups and 
Related Structures 


7.1 Definition of a Group 


A group $G$ is a set with elements $a, b, \ldots, e, \ldots, g, \ldots$ endowed with the following
properties:


(i) A group composition law is given, so that for any two elements of $G$
taken in a definite sequence, a unique product element is determined:

$$a, b \in G \Rightarrow c = ab \in G \tag{7.1}$$

In general the sequence is relevant, so $ab$ and $ba$ may be different.

(ii) This composition is associative, so for any three elements taken in a
definite sequence, the product is unambiguous:

$$a, b, c \in G : a(bc) = (ab)c \tag{7.2}$$

(iii) There is a distinguished element $e \in G$, the identity, which obeys:

$$a \in G : ae = ea = a \tag{7.3}$$

(iv) For each $a \in G$, there is a unique inverse element $a^{-1} \in G$, such that

$$a^{-1} a = a a^{-1} = e. \tag{7.4}$$

These laws defining a group are not expressed here in the most economical
form. One can show that $e$ is unique; that if one had defined separate left and
right inverses, they would have been the same; that for each $a$, its inverse $a^{-1}$
is unique; etc.
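For a finite group the four laws can be verified by brute force. The Python sketch below does so for the permutations of three objects, represented as tuples. The representation and the helper names `compose`, `inverse` and `is_group` are our own choices for illustration.

```python
from itertools import permutations

def compose(a, b):
    """Product ab of two permutations stored as tuples: apply b first, then a."""
    return tuple(a[b[i]] for i in range(len(a)))

def inverse(a):
    """Inverse permutation: sends a[i] back to i."""
    inv = [0] * len(a)
    for i, image in enumerate(a):
        inv[image] = i
    return tuple(inv)

def is_group(elements, identity):
    """Check closure, associativity, identity and inverses exhaustively."""
    elems = set(elements)
    closed = all(compose(a, b) in elems for a in elems for b in elems)
    associative = all(compose(a, compose(b, c)) == compose(compose(a, b), c)
                      for a in elems for b in elems for c in elems)
    has_identity = all(compose(a, identity) == a == compose(identity, a)
                       for a in elems)
    has_inverses = all(inverse(a) in elems
                       and compose(a, inverse(a)) == identity
                       and compose(inverse(a), a) == identity
                       for a in elems)
    return closed and associative and has_identity and has_inverses

S3 = list(permutations(range(3)))   # the 3! = 6 permutations of {0, 1, 2}
E = (0, 1, 2)                       # the identity permutation
```

Here `is_group(S3, E)` holds, and the two transpositions `(1, 0, 2)` and `(0, 2, 1)` give different products in the two possible orders, illustrating that the sequence matters.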

Familiar examples of groups are: $S_n$, the group of permutations of $n$ objects,
with $n!$ distinct elements; the group $SO(3)$ of proper real orthogonal rotations
in three dimensional space; the groups of Lorentz and of Poincaré
transformations in three space and one time dimension; the Galilei group relevant to
non-relativistic mechanics; the discrete group of translations of a crystal lattice.

Some groups have a finite number of elements (this is then the order of 
the group); others have a denumerable infinity of elements, and yet others a 
continuous infinity of elements which can be parametrised by a finite number of 
real independent continuously varying parameters. 


7.2 Conjugate Elements, Equivalence Classes 


Given any two elements $a, b \in G$, we say they are conjugate to one another if
there is an element $c \in G$ (there may be more than one) such that

$$b = c a c^{-1} \tag{7.5}$$

This is an equivalence relation, written as $a \sim b$, because the three properties
of such a relation do hold:

(i) $a \sim a$: take $c = e$;
(ii) $a \sim b \Rightarrow b \sim a$: use $c^{-1}$ in place of $c$;
(iii) $a \sim b,\ b \sim c \Rightarrow a \sim c$.

These are, respectively, the reflexive, the symmetric and the transitive
properties.

The group G thus splits into disjoint conjugacy classes or equivalence 
classes. To the class of a € G belong all elements b € G conjugate to a; 
elements in different classes are non conjugate; two different classes have no 
common elements; the identity e forms a class all by itself. 

As examples, let us mention the following: in $S_n$, all elements with a given
cycle structure belong to one equivalence class; in $SO(3)$, all rotations by a given
angle but about all possible axes form one class.
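The statement about $S_n$ can be checked directly for small $n$. The Python sketch below computes the conjugacy classes of $S_4$ by brute force and records the cycle structure of each element. Permutations are stored as tuples, and all function names are our own.

```python
from itertools import permutations

def compose(a, b):
    """Product ab of two permutations stored as tuples: apply b first, then a."""
    return tuple(a[b[i]] for i in range(len(a)))

def inverse(a):
    """Inverse permutation."""
    inv = [0] * len(a)
    for i, image in enumerate(a):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(group):
    """Split the group into classes {c a c^-1 : c in group}."""
    remaining = set(group)
    classes = []
    while remaining:
        a = next(iter(remaining))
        cls = {compose(compose(c, a), inverse(c)) for c in group}
        classes.append(cls)
        remaining -= cls
    return classes

def cycle_type(p):
    """Cycle lengths of p, sorted in decreasing order."""
    seen, lengths = set(), []
    for start in range(len(p)):
        if start not in seen:
            length, i = 0, start
            while i not in seen:
                seen.add(i)
                i = p[i]
                length += 1
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

S4 = list(permutations(range(4)))
CLASSES = conjugacy_classes(S4)
```

For $S_4$ this yields five classes, of sizes 1, 3, 6, 6 and 8, one for each of the cycle structures $(1,1,1,1)$, $(2,2)$, $(2,1,1)$, $(4)$ and $(3,1)$.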


7.3 Subgroups and Cosets 


A subset $H$ in a given group $G$, $H \subset G$, is a subgroup if its elements obey all
the laws of a group, it being understood that the composition law is the one
given in $G$. Taking advantage of the fact that products and inverses of elements
in $G$ are already defined, we can express concisely the condition for a subset $H$
to be a subgroup:

$$h_1, h_2 \in H \Rightarrow h_1^{-1} h_2 \in H \tag{7.6}$$


(Verify that this condition is necessary and sufficient for H to be a subgroup). 
The cases H = G or H = {e} lead to trivial subgroups. Every other 
subgroup is called a proper or nontrivial subgroup. 




Given a subgroup $H \subset G$, two (in general different) equivalence relations
can be set up in $G$ using $H$: one is called left equivalence with respect to $H$, and
the other right equivalence with respect to $H$. These are to be distinguished from
one another, and from equivalence defined by the conjugacy property (which
does not involve any subgroup). For any two elements $a, b \in G$,

$$\begin{aligned}
a \text{ and } b \text{ are left equivalent with respect to } H &\Leftrightarrow a^{-1} b \in H \\
&\Leftrightarrow b = ah \text{ for some } h \in H
\end{aligned} \tag{7.7}$$

(Verify that all the laws of an equivalence relation are obeyed). The
corresponding equivalence classes are called “the left cosets of $H$ in $G$”. The left
coset containing $a \in G$ is a subset written as

$$aH = \{\, ah \mid h \in H \,\} \tag{7.8}$$

A left coset is determined by any of its elements, any two left cosets are
identical or disjoint, and $G$ is the union of left cosets. The left coset containing
$e$ is $H$ itself; every other one does not contain $e$ and so is not a subgroup. Right
equivalence is set up similarly:

$$\begin{aligned}
a \text{ and } b \text{ are right equivalent with respect to } H &\Leftrightarrow a b^{-1} \in H \\
&\Leftrightarrow a = hb \text{ for some } h \in H
\end{aligned} \tag{7.9}$$

The corresponding equivalence classes are called right cosets of $H$ in $G$, a typical
one being

$$Ha = \{\, ha \mid h \in H \,\} \tag{7.10}$$

These too are mutually disjoint, etc.

For a general subgroup $H \subset G$, the set of left cosets and the set of right
cosets are two generally different ways of separating $G$ into disjoint subsets.
Each coset has the same “number of elements” as $H$.
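To see that left and right cosets really can differ, one may take $G = S_3$ and the two-element subgroup generated by a single transposition. The Python sketch below lists both families of cosets. The choice of subgroup and the function names are ours.

```python
from itertools import permutations

def compose(a, b):
    """Product ab of two permutations stored as tuples: apply b first, then a."""
    return tuple(a[b[i]] for i in range(len(a)))

def left_cosets(group, subgroup):
    """The distinct sets aH = {ah : h in H}."""
    return {frozenset(compose(a, h) for h in subgroup) for a in group}

def right_cosets(group, subgroup):
    """The distinct sets Ha = {ha : h in H}."""
    return {frozenset(compose(h, a) for h in subgroup) for a in group}

S3 = list(permutations(range(3)))
H = [(0, 1, 2), (1, 0, 2)]          # identity and the transposition of 0 and 1

LEFT = left_cosets(S3, H)
RIGHT = right_cosets(S3, H)
```

Both `LEFT` and `RIGHT` contain three cosets of two elements each, and each family covers all six elements of $S_3$, but the two partitions are not the same.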


7.4 Invariant (Normal) Subgroups, the Factor Group


If $H$ is a subgroup of $G$, and $g \in G$ is a general element, the set of elements

$$H_g = \{\, g h g^{-1} \mid h \in H,\ g \text{ fixed} \,\} = g H g^{-1} \tag{7.11}$$

is also a subgroup of $G$. If $g \in H$, then $H_g = H$; otherwise, in general we would
expect $H_g \neq H$.

The subgroup $H$ is said to be an invariant, or normal, subgroup if $H_g = H$
for every $g \in G$. In other words,

$$H \text{ is an invariant subgroup} \Leftrightarrow h \in H,\ g \in G \Rightarrow g h g^{-1} \in H. \tag{7.12}$$

For such a subgroup, the break up of $G$ into left and into right cosets coincide,
since each left coset is a right coset and conversely:

$$H \text{ invariant} \Leftrightarrow aH = Ha \text{ for each } a \in G \tag{7.13}$$


These (common) cosets can now be made into the elements of a new group! 
Namely, we define the product of two cosets aH and bH to be the coset con- 
taining ab: 

$$\begin{aligned}
\text{Coset composition:} \quad aH \cdot bH &= \{\, a h b h' \mid h, h' \in H \,\} \\
&= \{\, a b h \mid h \in H \,\} \\
&= abH
\end{aligned} \tag{7.14}$$

The group obtained in this way is called the factor group $G/H$: its elements
are the $H$-cosets of $G$, the identity is $H$ itself, etc. (Verify that since $H$ is
invariant, the definition (7.14) obeys all the group laws).

This construction cannot be carried out if H is not invariant. 
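A small example makes the construction concrete. In $S_3$ the even permutations form an invariant subgroup $A_3$, and the factor group $S_3/A_3$ has two elements. The Python sketch below checks invariance and multiplies cosets as sets. The names `is_invariant`, `coset` and `coset_product` are hypothetical helpers of our own.

```python
from itertools import permutations

def compose(a, b):
    """Product ab of two permutations stored as tuples: apply b first, then a."""
    return tuple(a[b[i]] for i in range(len(a)))

def inverse(a):
    """Inverse permutation."""
    inv = [0] * len(a)
    for i, image in enumerate(a):
        inv[image] = i
    return tuple(inv)

def sign(p):
    """+1 for an even permutation, -1 for an odd one (parity of inversions)."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def is_invariant(group, subgroup):
    """True if g h g^-1 lies in the subgroup for all g in group, h in subgroup."""
    sub = set(subgroup)
    return all(compose(compose(g, h), inverse(g)) in sub
               for g in group for h in sub)

def coset(a, subgroup):
    """The left coset aH as a frozenset."""
    return frozenset(compose(a, h) for h in subgroup)

def coset_product(c1, c2):
    """Set of all products of an element of c1 with an element of c2."""
    return frozenset(compose(a, b) for a in c1 for b in c2)

S3 = list(permutations(range(3)))
E = (0, 1, 2)
A3 = [p for p in S3 if sign(p) == 1]     # even permutations
H2 = [E, (1, 0, 2)]                      # a non-invariant subgroup, for contrast
FACTOR = {coset(a, A3) for a in S3}      # elements of S3/A3
```

`FACTOR` has two elements, the coset of even and the coset of odd permutations. The product of the odd coset with itself is the even coset, so $S_3/A_3$ behaves like the two-element group. For `H2` the invariance test fails, in line with the remark that the construction needs an invariant subgroup.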


7.5 Abelian Groups, Commutator Subgroup 


A group G is said to be abelian, or commutative, if the product of elements does 
not depend on their sequence: 


G abelian : ab = ba for every a,b € G. (7.15) 


If it is not so, we say G is nonabelian or noncommutative. 
The translations in space form an abelian group. The permutation group $S_n$
for $n \geq 3$, the rotation group $SO(3)$, and the Lorentz group are all nonabelian.
For a general group $G$, we can “measure the extent to which two elements
$a$ and $b$ do not commute” by defining their commutator to be another element
of $G$:

$$\begin{aligned}
q(a, b) &= \text{commutator of } a \text{ and } b \\
&= a b a^{-1} b^{-1} \in G
\end{aligned} \tag{7.16}$$

(This notion of the commutator of two group elements is closely related to the
commutators of operators familiar in quantum mechanics). Clearly we can say:

$$q(a, b) = e \Leftrightarrow ab = ba \Leftrightarrow a \text{ and } b \text{ commute} \tag{7.17}$$

The “function” $q(a, b)$ has the following obvious properties:

$$\begin{aligned}
q(a, b)^{-1} &= q(b, a), \\
c\, q(a, b)\, c^{-1} &= q(c a c^{-1}, c b c^{-1})
\end{aligned} \tag{7.18}$$



These properties allow us to define what can be called the commutator subgroup
of $G$, to be denoted as $Q(G, G)$. This notation will become clear later on. We
define:

$$\begin{aligned}
Q(G, G) &= \text{products of any numbers of commutators of pairs of elements in } G \\
&= \text{collection of elements } q(a_m, b_m)\, q(a_{m-1}, b_{m-1}) \cdots q(a_2, b_2)\, q(a_1, b_1) \\
&\quad\ \text{for all } m \text{ and all choices of } a\text{'s and } b\text{'s}
\end{aligned} \tag{7.19}$$


We see immediately that, using Eq.(7.18) when necessary: 
(i) Q(G, G) is an invariant subgroup of G; 
(ii) the factor group G/Q(G,G) is abelian. 


Property (ii) utilises the fact that, on account of ab = q(a, b)ba, ab and ba are in 
the same coset. One can extend the argument to say: if S C G is an invariant 
subgroup of G such that Q(G,G) is contained in it, i.e., S is “somewhere in 
between Q(G,G) and G”, then G/S too will be abelian. Conversely, one can 
also show easily that if S is an invariant subgroup of G such that G/S is abelian, 
then S must contain Q(G,G). In other words, Q(G, G) is the smallest invariant 
subgroup in G such that the associated factor group is abelian. 

All these properties entitle us to say that the commutator subgroup of 
a given group captures in an intrinsic way “the extent to which the group is 
non-abelian”. In fact, one has trivially the extreme property 


$$G \text{ is abelian} \Leftrightarrow Q(G, G) = \{e\} \tag{7.20}$$
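For a finite group the commutator subgroup can be computed by forming all commutators and closing them under multiplication. The Python sketch below does this for $S_3$ and for the abelian group $A_3$. The helper names are ours, and the closure routine relies on the group being finite (so that inverses arise as powers).

```python
from itertools import permutations

def compose(a, b):
    """Product ab of two permutations stored as tuples: apply b first, then a."""
    return tuple(a[b[i]] for i in range(len(a)))

def inverse(a):
    """Inverse permutation."""
    inv = [0] * len(a)
    for i, image in enumerate(a):
        inv[image] = i
    return tuple(inv)

def commutator(a, b):
    """q(a, b) = a b a^-1 b^-1."""
    return compose(compose(a, b), compose(inverse(a), inverse(b)))

def generated_subgroup(generators, identity):
    """Closure of the generators under products (enough for a finite group)."""
    elems = {identity} | set(generators)
    frontier = list(elems)
    while frontier:
        a = frontier.pop()
        for b in list(elems):
            for c in (compose(a, b), compose(b, a)):
                if c not in elems:
                    elems.add(c)
                    frontier.append(c)
    return elems

def commutator_subgroup(group, identity):
    """Q(G, G): the subgroup generated by all commutators of pairs in G."""
    commutators = {commutator(a, b) for a in group for b in group}
    return generated_subgroup(commutators, identity)

S3 = list(permutations(range(3)))
E = (0, 1, 2)
A3 = [E, (1, 2, 0), (2, 0, 1)]      # the cyclic subgroup of 3-cycles
```

For $S_3$ the result is the three-element subgroup $A_3$, while for the abelian group $A_3$ itself the commutator subgroup is just $\{e\}$, as Eq. (7.20) requires.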


7.6 Solvable, Nilpotent, Semisimple and Simple Groups


We can use the notions of invariant subgroups and of the commutator
subgroup to develop several other very useful concepts. Given a group $G$, form the
sequence of groups

$$\begin{aligned}
G_1 &= Q(G, G) = \text{commutator subgroup of } G; \\
G_2 &= Q(G_1, G_1) = \text{commutator subgroup of } G_1; \\
&\ \ \vdots \\
G_{j+1} &= Q(G_j, G_j) = \text{commutator subgroup of } G_j;
\end{aligned} \tag{7.21}$$

The series of groups

$$G_0 = G, G_1, G_2, \ldots, G_j, G_{j+1}, \ldots \tag{7.22}$$

is then such that each group here is a normal subgroup of the previous one, and
the corresponding factor group is abelian:

$$G_j / G_{j+1} = \text{abelian}, \quad j = 0, 1, 2, \ldots \tag{7.23}$$


We say the group $G$ is solvable if for some finite $j$, $G_j$ becomes trivial:

$$G \text{ is solvable} \Leftrightarrow G_n = \{e\} \text{ for some finite } n \tag{7.24}$$

Solvability is an important generalisation of being abelian. (The name has
its origin in properties of certain systems of differential equations). If $G$ is
solvable, there is of course a least value of $n$ for which Eq.(7.24) is satisfied,
which is then the significant value. If $G$ is abelian, then Eq.(7.24) is obeyed
already for $n = 1$, so $G$ is solvable. If $G$ is not solvable, then for no value of $n$
(however large!) does $G_n$ become trivial!


In contrast to the series of successive commutator subgroups (7.22), another
interesting series is the following. Given $G$, we form $Q(G, G)$ to begin with, but
now write it as $G^1$ instead of as $G_1$. Then we define $G^2, G^3, \ldots$ successively
thus:

$$\begin{aligned}
G^2 &= Q(G, G^1) = \text{products of any numbers of elements of the form } q(a, b) \\
&\quad\ \text{for } a \in G \text{ and } b \in G^1, \text{ and their inverses}; \\
G^3 &= Q(G, G^2) = \text{similarly defined but with } G^2 \text{ in place of } G^1; \\
&\ \ \vdots \\
G^{j+1} &= Q(G, G^j)
\end{aligned} \tag{7.25}$$

Each of these is in fact a group, and we have then the series of groups

$$G_0 = G^0 = G,\ G^1 = G_1 = Q(G, G),\ G^2 = Q(G, G^1),\ \ldots,\ G^{j+1} = Q(G, G^j) \tag{7.26}$$

The successive members in this series are not formed by the commutator
subgroup construction. Nevertheless, it is an instructive exercise to convince oneself
that for each $j$,

(i) $G^{j+1}$ is an invariant subgroup of $G^j$, and
(ii) $G^j / G^{j+1}$ is abelian.

Now the notation $Q(G, G)$ for the commutator subgroup is clear: it is a
special case of the more general object appearing in Eq.(7.25).
We say the group $G$ is nilpotent if for some $n$ (and so for some least value
of $n$), $G^n$ is trivial:

$$G \text{ is nilpotent} \Leftrightarrow G^n = \{e\} \text{ for some finite } n \tag{7.27}$$

Nilpotency is a stronger property than solvability:

$$G \text{ is nilpotent} \Rightarrow G \text{ is solvable, but not conversely} \tag{7.28}$$


In our later discussions concerning Lie groups, we shall have to deal with 
solvable groups to some extent, but not much with nilpotent ones. 
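The group $S_3$ gives a quick illustration of the "but not conversely" in Eq. (7.28). The Python sketch below computes both series by brute force. The function `Q` mirrors the notation $Q(A, B)$ of the text, while the remaining names and the stopping rule (stop when a series stabilises) are our own.

```python
from itertools import permutations

def compose(a, b):
    """Product ab of two permutations stored as tuples: apply b first, then a."""
    return tuple(a[b[i]] for i in range(len(a)))

def inverse(a):
    """Inverse permutation."""
    inv = [0] * len(a)
    for i, image in enumerate(a):
        inv[image] = i
    return tuple(inv)

def commutator(a, b):
    """q(a, b) = a b a^-1 b^-1."""
    return compose(compose(a, b), compose(inverse(a), inverse(b)))

def generated_subgroup(generators, identity):
    """Closure of the generators under products (enough for a finite group)."""
    elems = {identity} | set(generators)
    frontier = list(elems)
    while frontier:
        a = frontier.pop()
        for b in list(elems):
            for c in (compose(a, b), compose(b, a)):
                if c not in elems:
                    elems.add(c)
                    frontier.append(c)
    return elems

def Q(A, B, identity):
    """Subgroup generated by all q(a, b) with a in A and b in B."""
    return generated_subgroup({commutator(a, b) for a in A for b in B}, identity)

def derived_series(group, identity):
    """G_0 = G, G_{j+1} = Q(G_j, G_j), stopped when trivial or stationary."""
    series = [set(group)]
    while len(series[-1]) > 1:
        nxt = Q(series[-1], series[-1], identity)
        if nxt == series[-1]:
            break
        series.append(nxt)
    return series

def lower_central_series(group, identity):
    """G^0 = G, G^{j+1} = Q(G, G^j), stopped when trivial or stationary."""
    G = set(group)
    series = [G]
    while len(series[-1]) > 1:
        nxt = Q(G, series[-1], identity)
        if nxt == series[-1]:
            break
        series.append(nxt)
    return series

def is_solvable(group, identity):
    return len(derived_series(group, identity)[-1]) == 1

def is_nilpotent(group, identity):
    return len(lower_central_series(group, identity)[-1]) == 1

S3 = list(permutations(range(3)))
E = (0, 1, 2)
```

For $S_3$ the first series has orders 6, 3, 1, so the group is solvable. The second has orders 6, 3 and then stays at 3 forever, so $S_3$ is not nilpotent.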




Two other definitions are important: these are of simple and semisimple 
groups. We say: 


G is simple = G has no nontrivial invariant subgroups; 
G is semisimple & G has no nontrivial abelian invariant subgroups 
(7.29) 


The four notions — solvable, nilpotent, simple, semisimple — can be related 
to one another in various ways. On the one hand, as a counterpart to Eq.(7.28), 
we obviously have: 


G is simple > G is semisimple, but not conversely (7.30) 


On the other hand (leaving aside the case of abelian G when G; = G! = {e}), 
if G is solvable, then G, = G! = Q(G,G) must be a proper invariant subgroup 
of G, so G is not simple: 


G is solvable > G is not simple (7.31) 


By the same token, if G is simple, then G; = Q(G,G) must coincide with 
G, so G cannot be solvable: 


G is simple => G is not solvable (7.32) 


So to be simple and to be solvable are mutually exclusive properties; semisim- 
plicity is weaker than the former, and nilpotency is stronger than the latter. Of 
course a given group G need not have any of these four properties, but in a 
qualitative way one can say that nilpotent groups and simple groups are of op- 
posite extreme natures. In a “linear” way we can depict the situation thus: 


$$\text{Nilpotent} \implies \text{solvable} \;\cdots\; \text{general group} \;\cdots\; \text{semisimple} \impliedby \text{simple group}$$
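The statement "solvable but not conversely nilpotent" can be checked by hand on a small group. The following is only a sketch, assuming the two series are built as $G_{j+1} = Q(G_j, G_j)$ (solvable series) and $G^{j+1} = Q(G, G^{j})$ (nilpotent series); the helper names `compose`, `generated` and `commutator_group` are illustrative and not part of the text. Permutations of $\{0,1,2\}$ are stored as tuples, and $S_3$ is used as the test case.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of two permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated(elements, n):
    """Subgroup of S_n generated by `elements` (closure under products)."""
    group = {tuple(range(n))} | set(elements)
    while True:
        new = {compose(a, b) for a in group for b in group} - group
        if not new:
            return group
        group |= new

def commutator_group(A, B, n):
    """Q(A, B): subgroup generated by all a b a^-1 b^-1, a in A, b in B."""
    comms = {compose(compose(a, b), compose(inverse(a), inverse(b)))
             for a in A for b in B}
    return generated(comms, n)

n = 3
G = set(permutations(range(n)))            # S_3, order 6

# Solvable series (assumed form): G_1 = Q(G, G), G_2 = Q(G_1, G_1)
G1 = commutator_group(G, G, n)
G2 = commutator_group(G1, G1, n)

# Nilpotent series (assumed form): G^1 = Q(G, G), G^2 = Q(G, G^1)
Gup1 = commutator_group(G, G, n)
Gup2 = commutator_group(G, Gup1, n)

print("solvable series orders: ", len(G), len(G1), len(G2))
print("nilpotent series orders:", len(G), len(Gup1), len(Gup2))
```

Under these assumptions the solvable series of $S_3$ reaches the trivial group after two steps, while the nilpotent series stalls at the subgroup of order 3, so $S_3$ is solvable without being nilpotent.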


7.7 Relationships Among Groups 


Let $G$ and $G'$ be two groups, and let us write $a, b, \ldots, g, \ldots$ and $a', b', \ldots, g', \ldots$
for their elements. Are there ways in which we can compare G and G’? When 
can we say they are “essentially the same”? When can we say that one of them 
is a “larger version” or a “smaller version” of the other? These are natural 
qualitative questions that come to mind, and we can set up precise concepts 
that help us answer them: homomorphism, isomorphism and automorphism. 

A homomorphism from $G$ to $G'$ (the order is important!) is a mapping $\varphi$
from $G$ to $G'$ obeying:

$$\varphi : G \to G' : \; a \in G \to \varphi(a) = a' \in G'; \qquad \varphi(a)\varphi(b) = \varphi(ab) \tag{7.33}$$




For each $a \in G$, of course $\varphi$ must assign a unique unambiguous image $\varphi(a) \in G'$;
then multiplication in $G$ “goes over into” multiplication in $G'$, so the homomorphism
property is essentially that

product of images = image of product.

The following are easy and immediate consequences:

$$e = \text{identity of } G : \; \varphi(e) = e' = \text{identity of } G'; \qquad \varphi(a)^{-1} = \varphi(a^{-1}) \tag{7.34}$$


In general, we must recognise that in a homomorphism, 

(i) more than one element in $G$ may have the same image in $G'$, i.e., $\varphi$
could be many-to-one;

(ii) we may not get all of $G'$ as we allow $a$ in $\varphi(a)$ to run over all of $G$; so
$\varphi(G) \subseteq G'$.

We define next another kind of relationship between groups, then return 
to a discussion of homomorphisms. A homomorphism is an isomorphism if the 
map $\varphi : G \to G'$ is one-to-one and onto, hence invertible. In that case we
have a one-to-one correspondence between elements of G and G’, such that the 
multiplication laws are respected, and we can then really say the two groups 
are “essentially the same” and cannot be distinguished as groups. We say that 
they are isomorphic. 

Now return to the case of a homomorphism $\varphi : G \to G'$. It is easy to check
the following: 


(i) $\varphi(G) = \{\varphi(a) \mid a \in G\}$ is a subgroup of $G'$;

(ii) the kernel $K$ of the homomorphism, defined as

$$K = \{a \in G \mid \varphi(a) = e'\}$$

is an invariant subgroup of $G$;

(iii) the factor group $G/K$ is isomorphic to $\varphi(G)$.
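The three statements above can be tested on a concrete homomorphism. As a sketch (the choice of example is ours, not the text's), take the parity map from $S_3$ to the two-element multiplicative group $\{+1, -1\}$; the names `sign`, `kernel` and `cosets` are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of two permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def sign(p):
    """Parity of p: +1 if the number of inversions is even, -1 otherwise."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

G = list(permutations(range(3)))

# Homomorphism property: product of images = image of product.
is_homomorphism = all(sign(a) * sign(b) == sign(compose(a, b))
                      for a in G for b in G)

image = {sign(a) for a in G}
kernel = {a for a in G if sign(a) == 1}

# The kernel is invariant: g k g^-1 stays in the kernel for every g.
kernel_is_invariant = all(compose(compose(g, k), inverse(g)) in kernel
                          for g in G for k in kernel)

# Cosets gK; each coset is mapped to a single element of the image,
# which is the one-to-one correspondence between G/K and the image.
cosets = {frozenset(compose(g, k) for k in kernel) for g in G}
coset_images = [{sign(a) for a in coset} for coset in cosets]

print(is_homomorphism, kernel_is_invariant, len(cosets), len(image))
```

Here the kernel has three elements, there are two cosets, and each coset carries exactly one image value, in line with $G/K$ being isomorphic to $\varphi(G)$.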

So in the case of a homomorphism $\varphi : G \to G'$, the two groups play different
roles and the relationship is non-symmetrical or “one-way”. We can say that $G$
is “a larger version” of $G'$. If for simplicity we assume that $\varphi(G) = G'$, then
conversely we can say $G'$ is “a smaller version” of $G$. These are only meant to
be qualitative descriptions intended to help picture a homomorphism. Only if
$\varphi$ is an isomorphism are $G$ and $G'$ “essentially the same”.


An isomorphism in which the two groups are the same, $G' = G$, is called an
automorphism. That is, an automorphism $\tau$ of a group $G$ is a one-to-one, onto,
hence also invertible, mapping of $G$ onto itself preserving the group composition
law:

$$\tau : a \in G \to \tau(a) \in G, \qquad \tau(a)\tau(b) = \tau(ab), \qquad \tau^{-1} \text{ exists} \tag{7.35}$$


An example of an automorphism is given by conjugation with any fixed element
of $G$, say $g$. We define $\tau_g$, indexed by $g$, as

$$\tau_g(a) = g a g^{-1} \quad \text{for each } a \in G \tag{7.36}$$

It is easy to check that this is indeed an automorphism. Each element $a$
stays within its conjugacy class. Such automorphisms are called inner automorphisms.
They form a group on their own too, because

$$\tau_{g'} \circ \tau_g = \tau_{g'g} \tag{7.37}$$


Automorphisms which are not inner are called outer. The set of all auto- 
morphisms of a given group G can be shown to form a group, with the inner 
ones forming an invariant subgroup. 
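A brute-force check of the properties of inner automorphisms is easy for a small group. The sketch below uses $S_3$ as an assumed test case; `tau` is an illustrative name for the map $a \to g a g^{-1}$.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of two permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def tau(g):
    """Inner automorphism tau_g : a -> g a g^-1."""
    return lambda a: compose(compose(g, a), inverse(g))

G = list(permutations(range(3)))

# tau_g preserves the composition law: tau(a) tau(b) = tau(ab).
preserves_product = all(
    compose(tau(g)(a), tau(g)(b)) == tau(g)(compose(a, b))
    for g in G for a in G for b in G)

# tau_g is one-to-one and onto: it merely permutes the elements of G.
is_bijection = all(sorted(tau(g)(a) for a in G) == sorted(G) for g in G)

# Composition rule for inner automorphisms: tau_{g'} o tau_g = tau_{g'g}.
composition_rule = all(
    tau(g2)(tau(g1)(a)) == tau(compose(g2, g1))(a)
    for g1 in G for g2 in G for a in G)

print(preserves_product, is_bijection, composition_rule)
```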


7.8 Ways to Combine Groups — Direct and 
Semidirect Products 


Let $G_1$ and $G_2$ be two groups. Their direct product $G_1 \times G_2$ is a group defined
in the obvious way with natural group operations:

(i) Elements of $G_1 \times G_2$ are ordered pairs $(a_1, a_2)$, $a_1 \in G_1$, $a_2 \in G_2$.

(ii) Pairs are composed by the rule

$$(a_1, a_2)(b_1, b_2) = (a_1 b_1, a_2 b_2)$$

(iii) The identity and inverses are given as the pair $(e_1, e_2)$, and $(a_1, a_2)^{-1} = (a_1^{-1}, a_2^{-1})$ respectively.

That we do have a group here, and that $G_2 \times G_1$ is
isomorphic to $G_1 \times G_2$, are both easy to see. The extension to direct products
of more factors, such as $G_1 \times G_2 \times G_3$, is also evident.

A more intricate way of combining $G_1$ and $G_2$ to produce a third group is
the semidirect product. This requires more structure. Since the two groups in
the construction play very different roles, we prefer to write $H$ and $K$ rather
than $G_1$ and $G_2$ for them. To form the semidirect product $H \rtimes K$, we need
for each $k \in K$ an automorphism $\tau_k$ of $H$ such that

$$\tau_{k'} \circ \tau_k = \tau_{k'k} \tag{7.38}$$

In more detail, for each $k \in K$ we need an onto, invertible map $\tau_k$ of $H$ to itself
obeying all the following:

$$\begin{aligned} \tau_k(h')\,\tau_k(h) &= \tau_k(h'h) && \text{for } h, h' \in H;\\ \tau_{k'}(\tau_k(h)) &= \tau_{k'k}(h) && \text{for } k, k' \in K,\ h \in H. \end{aligned} \tag{7.39}$$




Then the group $H \rtimes K$ is defined as the set of all ordered pairs $(h, k)$ with
$h \in H$, $k \in K$ obeying the law of composition

$$(h', k')(h, k) = (h'\,\tau_{k'}(h),\; k'k) \tag{7.40}$$

The second factors in the pairs “mind their own business”, while in the composition
of the first factors drawn from $H$, the automorphisms $\tau$ play a part. It is
an instructive exercise to exploit the properties (7.39) and convince oneself that
the composition law (7.40) is an acceptable one. While the identity of $H \rtimes K$
is the ordered pair $(e, e')$ where $e$ and $e'$ are the identity elements in $H$ and $K$
respectively, inverses are given by

$$(h, k)^{-1} = (\tau_{k^{-1}}(h^{-1}),\; k^{-1}) \tag{7.41}$$

(here one must use the property $\tau(a)^{-1} = \tau(a^{-1})$ for an automorphism!)

The semidirect product reduces to the direct product when all the automorphisms
$\tau$ are taken to be the identity. Non-trivial examples are numerous:
the Euclidean group $E(3)$, the Poincaré group, the space group of a crystal, etc.
One can picture the semidirect product construction as a kind of inverse to the
passage to the factor group: indeed, $H \rtimes K$ contains as subgroups $(e, K)$ and
$(H, e')$ respectively isomorphic to $K$ and $H$; $(H, e')$ is an invariant subgroup of
$H \rtimes K$, and $H \rtimes K/(H, e')$ is isomorphic to $K$.

The semidirect product will appear later in the general theory of Lie groups. 
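The "instructive exercise" of checking that (7.40) defines a group can be carried out mechanically in a small case. The sketch below is an assumed example, not one taken from the text: $H = Z_3$ and $K = Z_2$, both written additively, with $\tau_k$ the identity for $k = 0$ and the inversion $h \to -h$ for $k = 1$. The function names are illustrative.

```python
from itertools import product

H = range(3)   # Z_3, written additively
K = range(2)   # Z_2, written additively

def tau(k, h):
    """Automorphism tau_k of H: identity for k = 0, inversion h -> -h for k = 1."""
    return h % 3 if k == 0 else (-h) % 3

def mul(x, y):
    """Composition law (h', k')(h, k) = (h' tau_{k'}(h), k'k) of Eq. (7.40)."""
    (h1, k1), (h2, k2) = x, y
    return ((h1 + tau(k1, h2)) % 3, (k1 + k2) % 2)

def inv(x):
    """Inverse (tau_{k^-1}(h^-1), k^-1) of Eq. (7.41)."""
    h, k = x
    k_inv = (-k) % 2
    return (tau(k_inv, (-h) % 3), k_inv)

elements = list(product(H, K))
e = (0, 0)

# Conditions (7.39) on the maps tau_k.
tau_is_automorphism = all(
    tau(k, (h1 + h2) % 3) == (tau(k, h1) + tau(k, h2)) % 3
    for k in K for h1 in H for h2 in H)
tau_composes = all(
    tau(k1, tau(k2, h)) == tau((k1 + k2) % 2, h)
    for k1 in K for k2 in K for h in H)

# Group axioms for the composition law.
associative = all(mul(mul(x, y), z) == mul(x, mul(y, z))
                  for x in elements for y in elements for z in elements)
inverses_ok = all(mul(x, inv(x)) == e and mul(inv(x), x) == e for x in elements)

# (H, e') is an invariant subgroup; the product is not abelian.
H_is_invariant = all(mul(mul(x, (h, 0)), inv(x))[1] == 0
                     for x in elements for h in H)
non_abelian = any(mul(x, y) != mul(y, x) for x in elements for y in elements)

print(associative, inverses_ok, H_is_invariant, non_abelian)
```

With this choice of $\tau$ the six-element group obtained is nonabelian, whereas taking $\tau_k$ to be the identity for both $k$ would give back the abelian direct product.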


7.9 Topological Groups, Lie Groups, Compact 
Lie Groups 


An introduction to topological and differential geometric concepts is given in 
the accompanying notes in this monograph; we use some of them at this stage. 

A topological group is a set G which is a group and a topological space at 
the same time, and the group operations (composition, taking of inverses) are 
continuous in the sense defined by the topology. 

A Lie group of dimension (or order) $r$ is a topological group $G$ such that $G$
is also a smooth manifold of (real) dimension r. This means that G is expressible 
as the union of a certain collection of open sets, each of which is homeomorphic 
to a connected open set of Euclidean real r-dimensional space. Thus, in each 
of these open sets in G, we can assign r independent, real coordinates to each 
group element in a smooth way. 

A compact Lie group of dimension r is an r-dimensional Lie group G such 
that as a topological space G is compact. This means that every open cover of 
G contains a finite subcover. 

If G is not compact, it is non-compact. 

Let $\mathfrak{N}$ be an open set (a neighbourhood) containing the identity in a Lie
group $G$. By $a\mathfrak{N}$ for some $a \in G$ we mean the collection of elements

$$a\mathfrak{N} = \{ag \mid g \in \mathfrak{N}\} \subset G \tag{7.42}$$

Continuity of group operations ensures that $a\mathfrak{N}$ too is open. Then obviously

$$G = \bigcup_{a \in G} a\mathfrak{N} \tag{7.43}$$

provides us with an open covering of $G$. If $G$ is compact, then this must contain
a finite subcover. So in a compact Lie group $G$, given any neighbourhood $\mathfrak{N}$
containing $e$, we can find a finite number of elements $a_1, a_2, \ldots, a_N$ in $G$ such
that

$$G = a_1\mathfrak{N} \cup a_2\mathfrak{N} \cup \ldots \cup a_N\mathfrak{N} \tag{7.44}$$

We would naturally expect $N$ to be larger for smaller $\mathfrak{N}$ and vice versa.
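The dependence of $N$ on the size of the neighbourhood can be made concrete for the simplest compact Lie group. The following sketch assumes the group $SO(2)$ parametrised by an angle, and takes $\mathfrak{N}$ to be the open arc of angles $|\theta| < \epsilon$; the translates are centred at equally spaced angles. The function names and the sampling check are illustrative choices.

```python
import math

def min_translates(eps):
    """Smallest N with N * (2 * eps) > 2 * pi, i.e. enough open arcs to cover the circle."""
    return math.floor(math.pi / eps) + 1

def covers(N, eps, samples=10000):
    """Check on a grid of angles whether the arcs centred at 2*pi*k/N cover SO(2)."""
    centres = [2 * math.pi * k / N for k in range(N)]
    for s in range(samples):
        theta = 2 * math.pi * s / samples
        # distance along the circle from theta to the nearest centre
        nearest = min(abs((theta - c + math.pi) % (2 * math.pi) - math.pi)
                      for c in centres)
        if nearest >= eps:
            return False
    return True

for eps in (1.0, 0.5, 0.1):
    N = min_translates(eps)
    print(f"eps = {eps}: N = {N}, covered = {covers(N, eps)}")
```

As the arc shrinks, the number of translates needed grows roughly like $\pi/\epsilon$, which is the qualitative behaviour described above.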

A compact Lie group is one for which in an intuitive sense, the volume is 
finite (however one must define volume precisely!) Examples among groups of 
importance in high energy physics are SO(3), SU(2), SU(3) etc. On the other 
hand, the Euclidean, Galilei, Lorentz and Poincaré groups are all non-compact. 
We will later be largely concerned with compact simple Lie groups; to some 
extent, in connection with spinors, we shall also deal with the pseudo-orthogonal 
groups SO(n, 1), SO(p,q) which are noncompact. 


Exercises for Chapter 7 


1. Determine which of the following sets with specified operations are groups, 
whether abelian or non-abelian: 


(i) All complex numbers under addition.
(ii) All complex numbers under multiplication.
(iii) All real three dimensional vectors under addition.
(iv) All real $n \times n$ matrices under multiplication.
(v) All real $n \times n$ unimodular matrices under multiplication.
(vi) All real orthogonal/complex unitary $n \times n$ matrices under multiplication.


2. For the permutation groups $S_n$, review the following:


(i) Various notations for group elements.
(ii) Composition law, identity, inverses.
(iii) Cycle structure of a permutation.
(iv) Equivalence classes as determined by cycle structure.
(v) Expression of any permutation as a product of transpositions.
(vi) Show that $S_2$ is abelian, $S_n$ for $n \geq 3$ is nonabelian.


3. For the abelian group of translations in $n$ real dimensions, consisting of
translation vectors $a = (a_1, a_2, \ldots, a_n)$, show that for any real non singular
$n \times n$ matrix $S$,

$$a \to a' = Sa$$

is an automorphism. Is it inner or outer?


4. The rotation groups in the Euclidean plane, proper and improper, are
defined as

$$SO(2)/O(2) = \{A = 2 \times 2 \text{ real matrix} \mid A^{T}A = 1_{2\times 2},\ \det A = 1/\pm 1\}$$

Show that $SO(2)$ is abelian, $O(2)$ nonabelian.


5. Show that from a direct product $G_1 \times G_2$ of two groups, the individual
factors can be recovered as invariant subgroups.


6. The Euclidean group $E(3)$ has elements $(R, a)$ where $R \in O(3)$, the real
orthogonal group, and $a$ is a 3-dimensional real translation vector. The
composition law is

$$(R', a')(R, a) = (R'R,\; a' + R'a)$$

Identify the $O(3)$ subgroup, the translation subgroup $T_3$, and show that
the latter is normal. Find the corresponding cosets and show that $E(3)$
is a semidirect product $T_3 \rtimes O(3)$. Are $O(3)$ and $T_3$ unique subgroups in
$E(3)$?


Chapter 8 


Review of Group 
Representations 


8.1 Definition of a Representation 


We shall throughout be concerned with linear representations on finite dimensional
real or complex linear vector spaces. Let a group $G$ be given, and let $V$ be
a (real or complex) linear vector space. We say we have a linear representation
of $G$ on $V$ if for each $g \in G$, we have a non-singular linear transformation $D(g)$
on $V$ such that:

$$\begin{aligned} &\text{(i)}\ \ D(g')D(g) = D(g'g) \ \text{ for all } g, g' \in G;\\ &\text{(ii)}\ \ D(e) = 1, \text{ the unit operator on } V;\\ &\text{(iii)}\ \ D(g)^{-1} = D(g^{-1}) \ \text{ for all } g \in G \end{aligned} \tag{8.1}$$


The dimension of the representation is the dimension of V; and it is real or 
complex according to the nature of V. The representation is faithful if and only 
if distinct group elements correspond to distinct linear transformations. 

The three conditions above are not stated in a minimal and most econom- 
ical way, but contain some redundancy. This is desirable in the interest of 
explicitness, similar to the way a group was defined in Section 7.1. 

The use of a basis for V allows us to represent each operator D(g) by a 
corresponding matrix. For simplicity of notation the same symbol D(g) will be 
used for both the abstract linear transformation and its matrix representative 
in a definite basis. From the context the meaning will always be clear. 

If we pick a basis $\{e_j\}$, $j = 1, 2, \ldots$, for $V$, made up of linearly independent
vectors, then the matrix associated with $D(g)$ is obtained as follows:

$$g \in G : \; D(g)\,e_j = D(g)_{kj}\,e_k \tag{8.2}$$

Here a summation on the repeated index $k$ is understood, and the positions
of the indices $j$ and $k$ must be carefully noticed. The composition law for the
transformations, Eq.(8.1)(i), then translates into matrix form as

$$D(g')_{jk}\,D(g)_{kl} = D(g'g)_{jl}; \tag{8.3}$$

and for the identity element we have the unit matrix:

$$D(e)_{jk} = \delta_{jk} \tag{8.4}$$

In this way, for each $g \in G$ we have an $n \times n$ matrix $D(g)$ with elements
$D(g)_{jk}$; it is real if $V$ is real, and may be real or complex if $V$ is complex. These
representation matrices follow, upon multiplication, the group composition law.
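The index convention of Eq.(8.2) can be illustrated with permutation matrices. In the sketch below (an assumed example: $S_3$ acting on a three-dimensional space by $D(g)e_j = e_{g(j)}$), the matrix element $D(g)_{kj}$ is stored as `D(g)[k][j]`, so the image of $e_j$ is read off from column $j$.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of two permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def D(g):
    """Matrix with D(g) e_j = e_{g(j)}, i.e. D(g)[k][j] = 1 exactly when k = g(j)."""
    n = len(g)
    return [[1 if k == g[j] else 0 for j in range(n)] for k in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

G = list(permutations(range(3)))
identity = tuple(range(3))

# Eq. (8.3): the matrices multiply the way the group elements do.
composition_ok = all(matmul(D(g2), D(g1)) == D(compose(g2, g1))
                     for g1 in G for g2 in G)

# Eq. (8.4): the identity is represented by the unit matrix.
identity_ok = D(identity) == [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Faithfulness: distinct group elements give distinct matrices.
faithful = len({tuple(map(tuple, D(g))) for g in G}) == len(G)

print(composition_ok, identity_ok, faithful)
```

Had the matrix been defined with the indices in the other order, `D(g)[j][k]`, the products would come out in the reverse order, which is why the index positions in (8.2) matter.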

We are of course free to replace the basis $\{e_j\}$ by another one, $\{e'_j\}$ say.
The effect on the representation matrices is then a similarity transformation. It
is an elementary exercise to check that this is so:

$$\begin{aligned} e_j \to e'_j = (S^{-1})_{kj}\,e_k &\implies e_j = S_{kj}\,e'_k,\\ D(g)_{jk} \to D'(g)_{jk} &= S_{jl}\,D(g)_{lm}\,(S^{-1})_{mk},\\ \text{i.e. } D'(g) &= S\,D(g)\,S^{-1} \end{aligned} \tag{8.5}$$


It is of course desirable to think of group representations and their most impor- 
tant properties as far as possible in an intrinsic and basis independent way. It 
is thus that one gets “to the heart of the matter”. However, when expedient, 
and for practical purposes, one need not hesitate to use specific bases, matrices 
etc. In Feynman’s words, this does not by any means involve a sense of defeat! 


8.2 Invariant Subspaces, Reducibility, 
Decomposability 


Let $g \to D(g)$ be a linear representation of the group $G$ on the (real or complex)
linear vector space $V$. If there is a non-trivial subspace $V_1 \subset V$, neither equal
to the full space $V$ nor to the zero space $0$, such that

$$x \in V_1,\ g \in G \implies D(g)x \in V_1, \tag{8.6}$$

we say the subspace $V_1$ is invariant under the given representation, and that
the representation itself is reducible. If there is no such nontrivial invariant
subspace, then the representation is irreducible.

Suppose a given representation is reducible, on account of the existence of
the non-trivial invariant subspace $V_1 \subset V$. If we can find another nontrivial
subspace $V_2 \subset V$, such that

(i) $V = V_1 \oplus V_2$,

(ii) $V_2$ is also an invariant subspace,

then the given representation is decomposable. If such $V_2$ cannot be found, then
the representation is indecomposable.
Thus the various possibilities can be depicted diagrammatically thus:

    Representation $D(g)$ of $G$ on $V$
      |
      +-- Irreducible: no nontrivial invariant subspace
      |
      +-- Reducible: there is a non-trivial invariant subspace $V_1 \subset V$
            |
            +-- Decomposable: there is another non-trivial invariant
            |   subspace $V_2$ such that $V = V_1 \oplus V_2$
            |
            +-- Indecomposable: no such $V_2$ exists


In matrix form, these possibilities can be recognised more graphically. If
a basis for $V$ is made up of a basis for $V_1$ plus some additional basis vectors,
then reducibility tells us that in such a basis the representation matrices have
the forms

$$\text{Reducible:} \quad D(g) = \begin{pmatrix} D_1(g) & B(g) \\ 0 & D_2(g) \end{pmatrix} \quad \text{for all } g \in G \tag{8.7}$$

where $D_1(g)$ and $D_2(g)$ are both non-singular, and in fact both form representations
of $G$. The indecomposable case is when it is impossible to reduce the off
diagonal block $B(g)$ to zero; in the decomposable case, we are able to choose the
additional basis vectors to span the invariant subspace $V_2$, so $B(g)$ does vanish:

$$\text{Decomposable:} \quad D(g) = \begin{pmatrix} D_1(g) & 0 \\ 0 & D_2(g) \end{pmatrix} \quad \text{for all } g \in G \tag{8.8}$$

Of course, in the irreducible case, even the form (8.7) cannot be achieved for all
elements $g$.
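The block forms above can be seen explicitly in a small case. As a sketch under our own choice of example, take $S_3$ permuting the basis vectors of a three-dimensional real space, $D(g)e_j = e_{g(j)}$. The line $V_1$ spanned by $(1,1,1)$ and the plane $V_2$ of vectors with vanishing component sum are both invariant. The adapted basis `basis` and the helper `coords` are illustrative names.

```python
from fractions import Fraction
from itertools import permutations

def act(g, v):
    """Apply D(g): the basis vector e_j is sent to e_{g(j)}."""
    w = [0] * len(v)
    for j, vj in enumerate(v):
        w[g[j]] += vj
    return w

# Adapted basis: b1 spans V_1, while b2 and b3 span V_2 (component sum zero).
basis = [[1, 1, 1], [1, -1, 0], [0, 1, -1]]

def coords(v):
    """Components (c1, c2, c3) of v = c1*b1 + c2*b2 + c3*b3."""
    c1 = Fraction(sum(v), 3)
    return [c1, v[0] - c1, c1 - v[2]]

def adapted_matrix(g):
    """Matrix of D(g) in the adapted basis, with entry [k][j] = D(g)_{kj}."""
    columns = [coords(act(g, b)) for b in basis]
    return [[columns[j][k] for j in range(3)] for k in range(3)]

G = list(permutations(range(3)))

# Lower-left block vanishes: V_1 is invariant (form (8.7)).
V1_invariant = all(adapted_matrix(g)[1][0] == 0 and adapted_matrix(g)[2][0] == 0
                   for g in G)
# Upper-right block B(g) vanishes too: V_2 is invariant (form (8.8)).
V2_invariant = all(adapted_matrix(g)[0][1] == 0 and adapted_matrix(g)[0][2] == 0
                   for g in G)

print(V1_invariant, V2_invariant)
for row in adapted_matrix((1, 2, 0)):
    print([str(x) for x in row])
```

In this basis every $D(g)$ is block diagonal with a $1 \times 1$ block and a $2 \times 2$ block, so this particular representation is reducible and decomposable.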

In the reducible decomposable case, the original representation $D(\cdot)$ on $V$ is
really obtained by “stacking” or putting together the two representations $D_1(\cdot)$
on $V_1$ and $D_2(\cdot)$ on $V_2$. We call this the process of forming the direct sum of
these two representations. Quite generally, let representations $D_1(\cdot)$ and $D_2(\cdot)$
on (both real or both complex!) linear vector spaces $V_1$ and $V_2$ be given. First
form the direct sum vector space $V = V_1 \oplus V_2$, so that

$$x \in V \implies x = x_1 + x_2 \text{ uniquely}, \qquad x_1 \in V_1,\ x_2 \in V_2 \tag{8.9}$$

Then the direct sum of the two representations $D_1(\cdot)$ and $D_2(\cdot)$ is the representation
$D(\cdot)$ on $V$ with the action

$$D(g)x = D_1(g)x_1 + D_2(g)x_2 \tag{8.10}$$


The direct sum construction can obviously be extended to three or more sum- 
mands. 

Go back now to a given reducible decomposable representation $D(\cdot)$ on $V$.
Now we raise the same question of reducibility for $D_1(\cdot)$ and $D_2(\cdot)$. If each of
these is either irreducible or reducible decomposable, we can keep posing the
same question for their “parts”, and so on. If we can in this way break up
$D(\cdot)$ into finally irreducible pieces, we then say that $D(\cdot)$ is a fully reducible
representation. That is to say, a reducible representation $D(\cdot)$ on $V$ is fully
reducible if we are able to express $V$ as the direct sum of invariant subspaces,

$$V = V_1 \oplus V_2 \oplus \ldots \oplus V_n \tag{8.11}$$

and $D(\cdot)$ restricted to each of these is irreducible. For all the groups we shall deal
with, every representation is either irreducible or fully reducible. In particular 
this is true for all finite groups, and also for any simple group. 


8.3 Equivalence of Representations, Schur’s
Lemma


For a given group $G$, suppose $D(\cdot)$ and $D'(\cdot)$ are two representations on the
spaces $V$ and $V'$, both being real or both complex. We say that these representations
are equivalent if and only if there is a one-to-one, onto, hence invertible
linear mapping $T : V \to V'$ such that

$$D'(g) = T\,D(g)\,T^{-1} \quad \text{for all } g \in G \tag{8.12}$$


Clearly, V and V’ must then have the same dimension. In practice, when we 
deal with equivalent representations, the spaces V and V’ are the same. 

The change of basis described in Eq.(8.5) of Section 8.1 leads us from one
matrix representation to another equivalent one. In the definition of equivalence
of representations given above, no statement about the reducibility or otherwise
of $D(\cdot)$ and $D'(\cdot)$ was made. Actually one can easily convince oneself that
two equivalent representations must be both irreducible, or both reducible; and
in the latter case both must be decomposable or both indecomposable. For
irreducible representations, a very useful test of equivalence exists, and it is
called Schur’s Lemma. It states: if $D(\cdot)$ and $D'(\cdot)$ are irreducible representations
of a group $G$ on linear spaces $V$ and $V'$, and if there is a linear mapping
$T : V \to V'$ that intertwines the two representations in the sense

$$D'(g)\,T = T\,D(g) \quad \text{for all } g \in G, \tag{8.13}$$

then either

(i) $D(\cdot)$ and $D'(\cdot)$ are inequivalent, and $T = 0$, or

(ii) $D(\cdot)$ and $D'(\cdot)$ are equivalent, and in case $T \neq 0$, then $T^{-1}$ exists.


This is a very nice result, which we can exploit in more than one way. On
the one hand, if we know that $D(\cdot)$ and $D'(\cdot)$ are inequivalent, then we are
sure that any intertwining $T$ we may succeed in constructing must be zero. On
the other hand, if we know that $T$ is non-zero, then in fact $T$ will also be non-singular,
and the representations must be equivalent. With both representations
given to be irreducible, it can never happen that the intertwining operator $T$
is non-zero and singular! The proof (which we do not give here) is a clever
exploitation of the range and null spaces of $T$, respectively subspaces of $V'$ and
$V$, and the assumed irreducibility of $D(\cdot)$ and $D'(\cdot)$.

As a special case of this general result, we can take $D'(\cdot) = D(\cdot)$, so we
have just one irreducible representation on a given space $V$. Then one easily
finds the result:

$$D(\cdot) \text{ irreducible},\ D(g)T = TD(g) \text{ for all } g \in G \implies T = \lambda\,1 = \text{multiple of the unit operator on } V. \tag{8.14}$$
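A standard way to produce operators commuting with a representation of a finite group is to average over the group: for any matrix $M$, the sum $T = \sum_g D(g) M D(g)^{-1}$ satisfies $D(g)T = TD(g)$. The sketch below applies this to a two-dimensional irreducible representation of $S_3$ by integer matrices; the generators `r` and `s` are an assumed choice made for illustration, not taken from the text, and the result can be compared with (8.14).

```python
def matmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def inverse(A):
    """Inverse of a 2x2 integer matrix whose determinant is +1 or -1."""
    (a, b), (c, d) = A
    det = a * d - b * c          # equals its own reciprocal when det = +-1
    return ((d * det, -b * det), (-c * det, a * det))

# Assumed generators: r has order 3, s has order 2, and s r s = r^-1.
r = ((0, -1), (1, -1))
s = ((0, 1), (1, 0))
unit = ((1, 0), (0, 1))

# Close {unit, r, s} under multiplication to obtain all six matrices.
group = {unit, r, s}
while True:
    new = {matmul(a, b) for a in group for b in group} - group
    if not new:
        break
    group |= new

def average(M):
    """T = sum over g of D(g) M D(g)^-1, which commutes with every D(g)."""
    total = ((0, 0), (0, 0))
    for g in group:
        term = matmul(matmul(g, M), inverse(g))
        total = tuple(tuple(total[i][j] + term[i][j] for j in range(2))
                      for i in range(2))
    return total

M = ((1, 2), (3, 4))             # an arbitrary starting matrix
T = average(M)

commutes = all(matmul(g, T) == matmul(T, g) for g in group)
print(len(group), T, commutes)
```

For this $M$ the average is 15 times the unit matrix: the trace of $T$ is the group order times the trace of $M$, and a multiple of the $2 \times 2$ unit matrix with trace 30 must be $15 \cdot 1$.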


8.4 Unitary and Orthogonal Representations 


We now consider spaces V carrying an inner product, and representations D(-) 
preserving this product. First we take the case of a complex linear space V, 
later the real case. 

Denote by $(x, y)$ the hermitian nondegenerate positive definite inner product
on the complex vector space $V$. As is conventional in quantum mechanics,
we assume it to be linear in $y$ and antilinear in $x$. A representation $D(\cdot)$ of $G$
on $V$ being given, we say it is unitary if it respects the inner product, i.e.,

$$\begin{aligned} (D(g)x, D(g)y) &= (x, y) && \text{for all } g \in G;\ x, y \in V,\\ \text{i.e., } (x, D(g)^{\dagger}D(g)y) &= (x, y) && \text{for all } g \in G;\ x, y \in V,\\ \text{i.e., } D(g)^{\dagger}D(g) &= 1 = \text{unit operator on } V \end{aligned} \tag{8.15}$$

The dagger denotes of course the hermitian adjoint determined by the inner
product $(\cdot\,, \cdot)$. If we use an orthonormal basis for $V$, then the matrices $(D(g)_{jk})$
will be unitary matrices.




Next, consider a real vector space $V$ carrying a symmetric nondegenerate
positive definite inner product, which for simplicity is again denoted by $(x, y)$.
By introducing an orthonormal basis, and associating components with the
vectors $x$ and $y$, we can put the inner product into the usual form

$$(x, y) = x^{T} y \tag{8.16}$$

If the representation $D(\cdot)$ of $G$ on $V$ preserves this inner product, we say it is
real orthogonal. The corresponding matrices then obey, in addition to being
real,

$$D(g)^{T} D(g) = 1 \tag{8.17}$$


For finite groups, as well as for compact Lie groups, it is true that any 
representation is equivalent to a unitary one; we can say any representation 
is essentially unitary. If for a group, every representation is both essentially 
unitary and fully reducible, then the basic building blocks for its representation 
theory are unitary irreducible representations, often abbreviated as UIR’s. This 
is again so for finite groups and for compact simple Lie groups. 
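For a finite group, one familiar route to an invariant inner product is averaging over the group. The sketch below works with a real two-dimensional representation of $S_3$ by integer matrices that are not orthogonal in the given basis; the generators `r` and `s` are an assumed choice for illustration. It forms $H = \sum_g D(g)^{T} D(g)$ and checks that the form $x^{T} H y$ is positive definite and preserved by every $D(g)$, so that the representation is orthogonal with respect to it.

```python
def matmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def transpose(A):
    return tuple(zip(*A))

# Assumed generators of a real, non-orthogonal representation of S_3.
r = ((0, -1), (1, -1))
s = ((0, 1), (1, 0))
unit = ((1, 0), (0, 1))

group = {unit, r, s}
while True:
    new = {matmul(a, b) for a in group for b in group} - group
    if not new:
        break
    group |= new

# r itself is not orthogonal in the original basis.
r_is_orthogonal = matmul(transpose(r), r) == unit

# Averaged form H = sum over g of D(g)^T D(g).
H = ((0, 0), (0, 0))
for g in group:
    term = matmul(transpose(g), g)
    H = tuple(tuple(H[i][j] + term[i][j] for j in range(2)) for i in range(2))

# Invariance: D(g)^T H D(g) = H for every g.
invariant = all(matmul(matmul(transpose(g), H), g) == H for g in group)

# Positive definiteness of a symmetric 2x2 matrix: H_11 > 0 and det H > 0.
positive_definite = H[0][0] > 0 and H[0][0] * H[1][1] - H[0][1] * H[1][0] > 0

print(r_is_orthogonal, H, invariant, positive_definite)
```

Passing to a basis that is orthonormal with respect to $H$ would then turn all six matrices into real orthogonal ones; that last change of basis is not carried out in this sketch.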

Quite often, a complex vector space $V$ and a representation $D(\cdot)$ of $G$ on
$V$ may be given, and we may search for an acceptable inner product on $V$
under which $D(\cdot)$ is unitary. If no such inner product can be found, then we
have a non-unitary representation. Physically important examples are all the
finite dimensional nontrivial representations of the Lorentz group $SO(3, 1)$, of
its covering group $SL(2, C)$, and of the higher dimensional groups $SO(p, q)$ of
pseudo-orthogonal type.

It can very well happen that in a suitable basis for a complex vector space V, 
the matrices D(-) of a representation all become real. This motivates questions 
like: when can a unitary representation be brought to real orthogonal form? 
We shall describe the answer after defining some common ways of passing from 
a given representation to related ones. 


8.5 Contragredient, Adjoint and Complex 
Conjugate Representations 


Let $D(\cdot)$ be an irreducible representation of $G$ on $V$; for definiteness choose some
basis in $V$, and work with the corresponding matrices of the representation. We
can take $V$ to be a complex space, but do not assume any inner product on
it, or that $D(\cdot)$ is unitary. (The alternative to using a basis and representative
matrices would be to work with the dual to $V$). From this given irreducible
representation, by simple matrix operations three other representations can be
constructed:

$$\begin{aligned} &g \to D(g) \text{ is an irreducible matrix representation} \implies\\ &g \to (D(g)^{T})^{-1} \text{ is the irreducible representation contragredient to } D(\cdot);\\ &g \to (D(g)^{\dagger})^{-1} \text{ is the irreducible representation adjoint to } D(\cdot);\\ &g \to D(g)^{*} \text{ is the irreducible representation complex conjugate to } D(\cdot) \end{aligned} \tag{8.18}$$

Since in this general situation, neither reality nor unitarity nor orthogonality of
$D(\cdot)$ was assumed, each of these derived representations could be inequivalent
to $D(\cdot)$ itself. If, however, $V$ carried a hermitian inner product and $D(\cdot)$ was
unitary, then not all these representations are different:

$$D(\cdot) \text{ unitary} \implies D(\cdot) \text{ is self-adjoint}, \quad \text{contragredient} = \text{complex conjugate} \tag{8.19}$$

Similarly, for a real space $V$, $D(\cdot)$ is by definition real. In addition,

$$D(\cdot) \text{ orthogonal} \implies D(\cdot) \text{ is self-contragredient} \tag{8.20}$$


In general terms, the property of being self-contragredient has interesting
consequences. Take an irreducible matrix representation $D(\cdot)$ of $G$, whether
real or complex, and suppose it is equivalent to its contragredient. This means
that there is a non-singular matrix $S$ such that

$$(D(g)^{T})^{-1} = S\,D(g)\,S^{-1} \quad \text{for all } g \in G \tag{8.21}$$

By exploiting this relation twice, we quickly arrive at

$$S^{-1}S^{T} D(g) = D(g)\,S^{-1}S^{T} \quad \text{for all } g \in G \tag{8.22}$$

Then Schur’s Lemma in the form (8.14) implies that for some constant $c$,

$$\begin{aligned} S^{-1}S^{T} &= c \cdot 1,\\ \text{i.e. } S^{T} &= cS,\\ \text{i.e. } S &= c^{2} S,\\ \text{i.e. } c &= \pm 1,\ S^{T} = \pm S \end{aligned} \tag{8.23}$$

Thus, the similarity transformation relating $D(\cdot)$ and $(D(\cdot)^{T})^{-1}$ is definitely
either symmetric or antisymmetric, and the latter case can only arise if the
dimension $n$ of the representation is even (an antisymmetric $S$ obeys
$\det S = \det S^{T} = (-1)^{n} \det S$, and $\det S \neq 0$). Let us now write the equivalence (8.21)
in the form

$$D(g)^{T} S\,D(g) = S \quad \text{for all } g \in G \tag{8.24}$$

This means that if we define a non-degenerate bi-linear form $(x, y)$ on $V$ using
the matrix $S$ as

$$(x, y) = x^{T} S y, \tag{8.25}$$




then the representation $D(\cdot)$ preserves this form:

$$(D(g)x, D(g)y) = x^{T} D(g)^{T} S\,D(g)\,y = (x, y) \tag{8.26}$$


Thus, the characteristic of a self contragredient representation is that it pre- 
serves a symmetric or an antisymmetric nondegenerate bi-linear form (there is 
no statement however of positive definiteness of this form!). Conversely, invari- 
ance of such a bi-linear form evidently means that D(-) is self contragredient. 

As a last point in this section, we analyse the possible ways in which an 
irreducible representation D(-) can be related to its complex conjugate D(-)*. 
We will deal only with the case where D(-) is given to be unitary as well. Even 
so, the situation shows some subtleties, and it is worth examining these in some 
detail. 

We are given, then, a UIR D(-) of a group G on a complex linear vector 
space V equipped with a hermitian positive definite inner product. By choosing 
an orthonormal basis, we have unitary representation matrices D(g). Let the 
dimension of $V$ be $n$. There are three mutually exclusive possibilities for the
relationship between D(-) and D(-)*: 


(i) Complex case: D(g) and D(g)* are inequivalent UIR’s. 


(ii) Potentially real case: D(g) and D(g)* are equivalent, and one can choose 
an orthonormal basis for V such that D(g) becomes real for all g € G. 


(iii) Pseudo real case: D(g) and D(g)* are equivalent, but they cannot be 
brought to real form. 


What we need to do is to characterise cases (ii) and (iii) in a useful and
interesting way. Since we have assumed unitarity, in both cases we have a
self-contragredient representation, so according to the discussion earlier in this
section there is a symmetric or antisymmetric matrix $S$ transforming $D^{*}(\cdot)$ into
$D(\cdot)$:

$$D(g)^{*} = S\,D(g)\,S^{-1}, \qquad S^{T} = \pm S \tag{8.27}$$

We will in fact find that

$$\begin{aligned} \text{Case (ii)},\ D(g) \text{ potentially real} &\iff S^{T} = S,\\ \text{Case (iii)},\ D(g) \text{ pseudo real} &\iff S^{T} = -S, \text{ so } n \text{ must be even} \end{aligned} \tag{8.28}$$

Case (iii) is simpler to handle than case (ii), so we dispose of it first. Maintaining
the unitarity of the representation, the most general change we can make is
a unitary transformation, which changes $S$ in such a way as to maintain its
antisymmetry (as we would have expected in any event):

$$\begin{aligned} D'(g) = U D(g) U^{-1},\ U^{\dagger}U = 1 &\implies D'(g)^{*} = S' D'(g) (S')^{-1},\\ S' = (U^{T})^{-1} S\,U^{-1} &\implies (S')^{T} = -S' \end{aligned} \tag{8.29}$$




Thus, no choice of $U$ can replace $S$ by the unit matrix or a multiple thereof,
which by Schur’s Lemma is what would have to be done if $D'(g)$ were to become
explicitly real. We have therefore shown that

$$S^{T} = -S \implies D(\cdot) \text{ cannot be brought to real form} \tag{8.30}$$


Case (ii) is a little more intricate. Now, with no loss of generality, we have
that $S$ is both unitary and symmetric:

$$S^{\dagger}S = 1,\ S^{T} = S \implies S^{*} = S^{-1} \tag{8.31}$$

Our aim is to show that we can unitarily transform $D(\cdot)$ to $D'(\cdot)$ by a suitable
choice of unitary $U$, as in the first line of Eq.(8.29), such that $D'(\cdot)$ is real. The
key is to realise that because of the properties (8.31) of $S$, we can express $S$ as
the square of a unitary symmetric matrix $U$:

$$S = U^{2}, \qquad U^{\dagger}U = 1, \quad U^{T} = U \tag{8.32}$$

The argument is the following: being unitary, the eigenvalues of $S$ are all
unimodular, and we can form an orthonormal basis out of the eigenvectors of
$S$. If $\psi$ is an eigenvector of $S$ with eigenvalue $e^{i\varphi}$, then

$$S\psi = e^{i\varphi}\psi \implies S^{*}\psi^{*} = e^{-i\varphi}\psi^{*} \implies S\psi^{*} = e^{i\varphi}\psi^{*} \tag{8.33}$$

on account of Eq.(8.31). Therefore either $\psi^{*}$ is proportional to $\psi$, in which case
its phase can be adjusted to make $\psi$ real; or else the real and imaginary parts
of $\psi$ are both eigenvectors of $S$ with the same eigenvalue $e^{i\varphi}$. In any case, we
see that the eigenvectors of $S$ can all be chosen real, and this is also true for
the members of an orthonormal basis of eigenvectors. This means that $S$ can
be diagonalised by a real orthogonal matrix $O$:

$$S = O\,d\,O^{-1}, \qquad O^{T}O = 1, \quad O^{*} = O, \qquad d = \text{diagonal, unitary} \tag{8.34}$$

If we now take $d^{1/2}$ to be any square root of $d$, likewise diagonal and unitary,
we see that

$$U = O\,d^{1/2}\,O^{-1} \tag{8.35}$$

obeys all of Eqs.(8.32). If the expression $U^{2}$ is used for $S$ in Eq.(8.27), after
rearrangement of factors, we get

$$(U D(g) U^{-1})^{*} = U D(g) U^{-1} \tag{8.36}$$

showing that the representation $D(\cdot)$ has been brought to real form.
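The construction of $U$ from $S$ can be followed through numerically in the smallest nontrivial case. The sketch below is an assumed illustration, not an example from the text: the unitary matrices $D(\theta) = \mathrm{diag}(e^{i\theta}, e^{-i\theta})$ satisfy $D(\theta)^{*} = S D(\theta) S^{-1}$ with the symmetric unitary matrix $S$ that interchanges the two basis vectors (adjoining a reflection represented by $S$ itself would make this an irreducible representation of $O(2)$). The helper names are illustrative.

```python
import cmath
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def conj(A):
    return [[complex(x).conjugate() for x in row] for row in A]

def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

S = [[0, 1], [1, 0]]                 # symmetric and unitary; here S^-1 = S

def D(theta):
    return [[cmath.exp(1j * theta), 0], [0, cmath.exp(-1j * theta)]]

# Real orthonormal eigenvectors of S are the columns of O (eigenvalues +1, -1).
c = 1 / math.sqrt(2)
O = [[c, c], [c, -c]]
sqrt_d = [[1, 0], [0, 1j]]           # one choice of d^(1/2) for d = diag(1, -1)

U = matmul(matmul(O, sqrt_d), transpose(O))   # Eq. (8.35), using O^-1 = O^T
U_inv = conj(transpose(U))                    # U is unitary, so U^-1 = U^dagger

theta = 0.7
real_form = matmul(matmul(U, D(theta)), U_inv)
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

print(close(conj(D(theta)), matmul(matmul(S, D(theta)), S)))   # Eq. (8.27)
print(close(matmul(U, U), S), close(U, transpose(U)))          # Eq. (8.32)
print(close(real_form, conj(real_form)))                       # Eq. (8.36)
print(close(real_form, rotation))
```

In this example the transformed matrices are the familiar real $2 \times 2$ rotation matrices.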

The most familiar illustration of all this is in the representation theory of 
SU(2), or angular momentum theory in quantum mechanics. It is well-known 
that all integer spin representations, which are odd-dimensional, are potentially 
real, and indeed their description using Cartesian tensors gives them in real form. 
However, all half odd integer spin representations which are even dimensional, 
are pseudo real: the invariant bi-linear form in these cases is antisymmetric. 
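For spin one-half the antisymmetric form can be exhibited directly. The sketch below assumes the usual parametrisation of an $SU(2)$ matrix by two complex numbers $a$, $b$ with $|a|^2 + |b|^2 = 1$, and takes $S$ to be the $2 \times 2$ antisymmetric matrix with entries $\pm 1$; the particular values of $a$ and $b$ are arbitrary.

```python
import cmath

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def conj(A):
    return [[complex(x).conjugate() for x in row] for row in A]

def close(A, B, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

# A generic SU(2) element with |a|^2 + |b|^2 = 0.36 + 0.64 = 1.
a = 0.6 * cmath.exp(0.3j)
b = 0.8 * cmath.exp(-1.1j)
D = [[a, b], [-b.conjugate(), a.conjugate()]]

S = [[0, 1], [-1, 0]]                # antisymmetric: S^T = -S
S_inv = [[0, -1], [1, 0]]

is_unitary = close(matmul(conj(transpose(D)), D), [[1, 0], [0, 1]])
conjugate_equivalent = close(conj(D), matmul(matmul(S, D), S_inv))  # Eq. (8.27)
form_preserved = close(matmul(matmul(transpose(D), S), D), S)       # Eq. (8.24)

print(is_unitary, conjugate_equivalent, form_preserved)
```

The check confirms only that the intertwining matrix here is antisymmetric; that no real form exists then follows from the argument leading to (8.30).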


8.6 Direct Products of Group Representations 


As a last item in this brief review of group representations, let us look at the
process of forming the direct product of two irreducible representations of a
group $G$. Let the two representations $D_1(\cdot)$ and $D_2(\cdot)$ operate on vector spaces
$V_1$ and $V_2$ respectively. Assume they are both complex or both real. First set
up the direct product vector space $V_1 \times V_2$: it is again a complex or a real space
depending on the natures of $V_1$ and $V_2$. As is well-known, $V_1 \times V_2$ is spanned
by outer products or ordered pairs of vectors of the form $x \otimes y$, where $x \in V_1$
and $y \in V_2$. (The building up of $V_1 \times V_2$ is rather subtle, if one wants to do it in
an intrinsic manner, but we do not go into those details here!). Then we define
the direct product representation $D(\cdot) = D_1(\cdot) \times D_2(\cdot)$ by giving the action on
a “monomial” as

$$D(g)(x \otimes y) = D_1(g)x \otimes D_2(g)y \tag{8.37}$$

and then extending the action to all of $V_1 \times V_2$ by linearity. Evidently, we do
have a representation of $G$ here.

Even if $D_1(\cdot)$ and $D_2(\cdot)$ are irreducible, $D(\cdot)$ may be reducible. If it is fully
reducible, in its reduction various irreducible representations may occur with
various multiplicities. This is the Clebsch-Gordan problem, and the direct sum
of irreducible representations contained in the product $D_1(\cdot) \times D_2(\cdot)$ is called
the Clebsch-Gordan series.
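In matrix language the direct product of two representations is the Kronecker product of their matrices. The sketch below takes $D_1 = D_2$ to be a two-dimensional irreducible representation of $S_3$ by integer matrices (the generators `r` and `s` are an assumed choice for illustration), checks that the product matrices again multiply like the group, and then exhibits an operator, the interchange $x \otimes y \to y \otimes x$, which commutes with every product matrix without being a multiple of the unit matrix. By (8.14) the product representation is therefore reducible.

```python
def matmul(A, B):
    n = len(A)
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n))
                 for i in range(n))

def kron(A, B):
    """Matrix of A x B on monomials e_i (x) e_j, ordered by the index 2*i + j."""
    return tuple(tuple(A[i][k] * B[j][l]
                       for k in range(len(A)) for l in range(len(B)))
                 for i in range(len(A)) for j in range(len(B)))

# Assumed generators of a 2-dimensional irreducible representation of S_3.
r = ((0, -1), (1, -1))
s = ((0, 1), (1, 0))
unit = ((1, 0), (0, 1))

group = {unit, r, s}
while True:
    new = {matmul(a, b) for a in group for b in group} - group
    if not new:
        break
    group |= new

# The product matrices obey the group composition law.
is_representation = all(
    matmul(kron(g2, g2), kron(g1, g1)) == kron(matmul(g2, g1), matmul(g2, g1))
    for g1 in group for g2 in group)

# Interchange operator P: e_i (x) e_j -> e_j (x) e_i.
P = ((1, 0, 0, 0),
     (0, 0, 1, 0),
     (0, 1, 0, 0),
     (0, 0, 0, 1))

P_commutes = all(matmul(P, kron(g, g)) == matmul(kron(g, g), P) for g in group)
P_is_scalar = all(P[i][j] == (P[0][0] if i == j else 0)
                  for i in range(4) for j in range(4))

print(is_representation, P_commutes, P_is_scalar)
```

Finding which irreducible pieces occur, and how often, is the Clebsch-Gordan problem proper and is not attempted in this sketch.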


Exercises for Chapter 8 


1. For the $n$-element cyclic group $C_n = \{e, a, a^{2}, \ldots, a^{n-1};\ a^{n} = e\}$, find all
irreducible unitary (one-dimensional) representations.


2. For the group of real translations $x \to x + a$ in one dimension, show that

$$a \to \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix}$$

is a representation. Is it reducible or irreducible, decomposable or indecomposable?


3. Find all one dimensional unitary (irreducible) representations of the $n$
dimensional real translation group $x_j \to x_j + a_j$, $j = 1, 2, \ldots, n$.


4. Show that the group in problem (3) is represented unitarily on the space
of all square integrable complex functions $f(x)$ on $\mathbb{R}^{n}$ by

$$f(x) \to (T(a)f)(x) = f(x - a).$$


Chapter 9 


Lie Groups and Lie 
Algebras 


We shall in this Chapter study the properties of Lie groups in some detail, 
using local systems of coordinates rather than the admittedly more powerful 
and concise intrinsic geometric methods. However as an exercise the reader is 
invited to apply the concepts of differential geometry developed elsewhere in 
this monograph, to see for himself or herself how to make this local description 
a global one. We shall see how Lie algebras arise from Lie groups, and how one 
can make the reverse transition as well. 


9.1 Local Coordinates in a Lie Group 


Let there be a Lie group $G$ of dimension $r$. Then in some suitable neighbourhood
$\mathfrak{N}$ of the identity $e$, we are able to smoothly assign coordinates in Euclidean
$r$-dimensional space to group elements. The dimension being $r$ implies that these
coordinates are both essential (i.e., we cannot do with fewer) and real. We use
Greek letters for coordinates and use the convention

$$ \begin{aligned}
a \in \mathfrak{N} \subset G &: \text{coordinates } \alpha^1, \alpha^2, \ldots, \alpha^r \equiv \alpha^j,\ j = 1, 2, \ldots, r; \\
b \in \mathfrak{N} \subset G &: \text{coordinates } \beta^1, \beta^2, \ldots, \beta^r \equiv \beta^j,\ j = 1, 2, \ldots, r
\end{aligned} \tag{9.1} $$

As a convention we always agree to assign coordinates zero to the identity:

$$ a = e:\ \alpha^j = 0 \tag{9.2} $$


As the variable element $a$ runs over the open set $\mathfrak{N}$, $\alpha^j$ runs over some open
set around the origin in Euclidean $r$-space. In this region, group elements and
coordinates determine each other uniquely.




The freedom to make smooth changes of coordinates in an invertible way,
$\alpha^j \to \bar{\alpha}^j$ say, is of course available. We must only observe that

$$ \alpha^j = 0 \iff \bar{\alpha}^j = 0 \tag{9.3} $$


Provided the elements $a, b \in \mathfrak{N}$ are chosen so that the product $c = ab \in \mathfrak{N}$ as
well (and similarly later on if we need to compose more elements), the law of
group composition in $G$ is given by a set of $r$ real-valued functions of $2r$ real
arguments each:

$$ c = ab \iff \gamma^j = f^j(\alpha;\beta), \quad j = 1, 2, \ldots, r \tag{9.4} $$


Thus in each system of coordinates, the structure of $G$ is expressed in a
corresponding system of group composition functions. Our convention (9.2) means
that these functions obey

$$ f^j(\alpha;0) = f^j(0;\alpha) = \alpha^j \tag{9.5} $$


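The existence of a unique inverse, $a^{-1}$, to each $a \in \mathfrak{N}$ (provided $a$ is chosen
so that $a^{-1} \in \mathfrak{N}$ as well!) is expressed as the following property of the $f$'s:
given $\alpha$, there is a unique $\alpha'$ such that

$$ f^j(\alpha;\alpha') = f^j(\alpha';\alpha) = 0, \quad a \leftrightarrow \alpha,\ a^{-1} \leftrightarrow \alpha' \tag{9.6} $$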


A change of coordinates (9.3) will result in a new set of functions $\bar{f}^j(\cdot\,;\cdot)$, say,
still subject to Eqs.(9.5), (9.6). We will assume for definiteness that the $f$'s are
sufficiently often differentiable (that this need not be assumed is an important
result in the theory of Lie groups, but we are working at a less sophisticated
level!). Thus smooth coordinate changes will guarantee this property for $\bar{f}$ as
well.

We now study the properties of $f^j$ implied by the fact that $G$ is a group,
in a step-by-step manner.
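A concrete case may help fix the conventions. The sketch below uses the two-parameter group of maps $x \to e^{p}x + q$ of the real line, with coordinates $\alpha = (p, q)$ vanishing at the identity. This group and the names `f`, `inverse`, `close` are illustrative assumptions, not taken from the text. The code checks Eqs.(9.5), (9.6) and associativity numerically at a few sample points.

```python
import math

def f(alpha, beta):
    """gamma = f(alpha; beta) for c = ab, where (p, q) acts as x -> exp(p)*x + q."""
    p1, q1 = alpha
    p2, q2 = beta
    return (p1 + p2, q1 + math.exp(p1) * q2)

def inverse(alpha):
    """Coordinates alpha' of the inverse: f(alpha; alpha') = f(alpha'; alpha) = 0."""
    p, q = alpha
    return (-p, -math.exp(-p) * q)

def close(u, v, tol=1e-12):
    return all(abs(a - b) < tol for a, b in zip(u, v))

e = (0.0, 0.0)                                   # identity has coordinates zero, Eq. (9.2)
a, b, c = (0.3, -0.2), (-0.5, 0.7), (0.1, 0.4)   # arbitrary sample elements

identity_ok = close(f(a, e), a) and close(f(e, a), a)                    # Eq. (9.5)
inverse_ok = close(f(a, inverse(a)), e) and close(f(inverse(a), a), e)   # Eq. (9.6)
associative_ok = close(f(c, f(a, b)), f(f(c, a), b))                     # c(ab) = (ca)b
noncommutative = not close(f(a, b), f(b, a), tol=1e-6)                   # ab != ba here
```

The group is deliberately non-abelian, so that the order of the arguments of `f` matters.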


9.2 Analysis of Associativity 


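Take three group elements $a, b, c \in \mathfrak{N}$ such that the product $cab \in \mathfrak{N}$ as well.
Associativity leads to functional equations for the $f$'s:

$$ c(ab) = (ca)b \Rightarrow f^j(\gamma; f(\alpha;\beta)) = f^j(f(\gamma;\alpha); \beta) \tag{9.7} $$

Differentiate both sides with respect to $\gamma^k$ and then set $\gamma = 0$, i.e. $c = e$:

$$ \frac{\partial f^j(\gamma; f(\alpha;\beta))}{\partial\gamma^k}\bigg|_{\gamma=0} = \frac{\partial f^j(\alpha;\beta)}{\partial\alpha^l}\, \frac{\partial f^l(\gamma;\alpha)}{\partial\gamma^k}\bigg|_{\gamma=0} \tag{9.8} $$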




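Here we have used Eq.(9.5) for the first factor on the right. Now the expressions
on the left and on the extreme right motivate us to define a system of $r^2$ functions
of $r$ real arguments each in this way:

$$ \eta^j_k(\alpha) = \frac{\partial f^j(\gamma;\alpha)}{\partial\gamma^k}\bigg|_{\gamma=0} \tag{9.9} $$

Thus, while the $f$'s are functions of two independent group elements, the $\eta$'s
have only one group element as argument. Again on account of Eq.(9.5), we
have

$$ \eta^j_k(0) = \delta^j_k \tag{9.10} $$

so we assume $\mathfrak{N}$ is so chosen as to make $\eta(\alpha)$ as a matrix nonsingular all over
$\mathfrak{N}$:

$$ \begin{aligned}
H(\alpha) &= (\eta^j_k(\alpha)), \\
H(\alpha)^{-1} &= \Xi(\alpha) = (\xi^j_k(\alpha)), \\
\eta^j_k(\alpha)\,\xi^k_l(\alpha) &= \delta^j_l, \\
\xi^j_k(0) &= \delta^j_k
\end{aligned} \tag{9.11} $$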


We treat superscripts (subscripts) as row (column) indices for matrix operations. 

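We can say: if a system of group composition functions obeying the
associativity law is given, then matrices $H(\alpha)$, $\Xi(\alpha)$ inverse to one another can
be defined; and the $f$'s will then obey, as a consequence of associativity, the
(generally nonlinear) system of partial differential equations

$$ \begin{aligned}
\frac{\partial f^j(\alpha;\beta)}{\partial\alpha^k} &= \eta^j_l(f(\alpha;\beta))\,\xi^l_k(\alpha) \\
&= (H(f(\alpha;\beta))\,\Xi(\alpha))^j_{\ k}, \\
f^j(0;\beta) &= \beta^j
\end{aligned} \tag{9.12} $$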


Here the $\alpha$'s are the active independent variables while the $\beta$'s appear via
boundary conditions.

We can now exploit this system (9.12) to our advantage. Suppose the
composition functions $f^j$ are not known, but instead the functions $\eta$, $\xi$ are given,
and one is also assured that the partial differential equations (9.12) can be solved
for the $f$'s. Then the structure of these equations guarantees that, once the $f$'s
have been found, they will have the associativity property, will provide inverses
etc., so that they will describe a group!
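The definition (9.9) and the system (9.12) can be checked numerically on an example. The sketch below again assumes the illustrative group $x \to e^{p}x + q$ (not taken from the text), computes $H(\alpha)$ by central finite differences, and compares the two sides of Eq.(9.12) at one sample point. The helper names and the step size `h` are choices made only for this sketch.

```python
import math

def f(alpha, beta):
    """Composition functions of the illustrative group x -> exp(p)*x + q."""
    p1, q1 = alpha
    p2, q2 = beta
    return (p1 + p2, q1 + math.exp(p1) * q2)

def jac_first(first, second, h=1e-6):
    """J[j][k] = d f^j(first; second) / d first^k, by central differences."""
    J = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(2):
        up, dn = list(first), list(first)
        up[k] += h
        dn[k] -= h
        fu, fd = f(up, second), f(dn, second)
        for j in range(2):
            J[j][k] = (fu[j] - fd[j]) / (2 * h)
    return J

def H(alpha):
    """Eq. (9.9): eta^j_k(alpha) = d f^j(gamma; alpha) / d gamma^k at gamma = 0."""
    return jac_first((0.0, 0.0), alpha)

def inv2(M):
    """Inverse of a 2x2 matrix; applied to H it gives Xi of Eq. (9.11)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

alpha, beta = (0.3, -0.2), (-0.5, 0.7)
lhs = jac_first(alpha, beta)                          # d f^j(alpha; beta) / d alpha^k
rhs = matmul2(H(f(alpha, beta)), inv2(H(alpha)))      # (H(f) Xi(alpha))^j_k
max_error = max(abs(lhs[j][k] - rhs[j][k]) for j in range(2) for k in range(2))
```

For this group one can also work out $H(\alpha)$ by hand: it is the identity matrix except for the lower-left entry, which equals $\alpha^2$, in agreement with Eq.(9.10) at $\alpha = 0$.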

It is perhaps useful to describe this situation in another way. If some
“arbitrary” nonsingular matrix of functions $(\eta)$ is given to us with inverse matrix
$(\xi)$, and we then set up the system (9.12) for unknown $f$'s, these equations will
in general not be soluble at all! The $\eta$'s must possess definite properties if
Eqs.(9.12) are to possess a solution; these will be obtained in Section 9.4. But
assuming for now that an acceptable set of $\eta$'s is given, we can show that the




$f$'s do have the correct properties to describe a group (at least locally!). This
we now prove: basically one just exploits the fact that $(\eta)$ and $(\xi)$ are inverses
to one another.

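Write $L^j(\gamma)$ and $R^j(\gamma)$ for the expressions on the two sides of Eq.(9.7), and
regard $\gamma$ as the active variables while both $\alpha$ and $\beta$ are left implicit. We can
develop systems of partial differential equations and boundary conditions for
both $L(\gamma)$ and $R(\gamma)$, based on Eqs.(9.11), (9.12):

$$ \begin{aligned}
L^j(\gamma) &\equiv f^j(\gamma; f(\alpha;\beta)): \\
\frac{\partial L^j}{\partial\gamma^k} &= (H(L)\,\Xi(\gamma))^j_{\ k}, \\
L^j(0) &= f^j(\alpha;\beta); \\
R^j(\gamma) &\equiv f^j(f(\gamma;\alpha); \beta): \\
\frac{\partial R^j}{\partial\gamma^k} &= \frac{\partial f^j(f(\gamma;\alpha);\beta)}{\partial f^l(\gamma;\alpha)}\, \frac{\partial f^l(\gamma;\alpha)}{\partial\gamma^k} \\
&= (H(R)\,\Xi(f(\gamma;\alpha))\,H(f(\gamma;\alpha))\,\Xi(\gamma))^j_{\ k} \\
&= (H(R)\,\Xi(\gamma))^j_{\ k}, \\
R^j(0) &= f^j(\alpha;\beta)
\end{aligned} \tag{9.13} $$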


Since $L(\gamma)$ and $R(\gamma)$ obey the same partial differential equations and boundary
conditions with respect to $\gamma$, and these equations have a unique solution, we
must have

$$ L^j(\gamma) = R^j(\gamma) \tag{9.14} $$


Thus associativity is guaranteed. 

How about inverses for group elements? Since $(\eta)$ and $(\xi)$ are both
nonsingular, so is the Jacobian matrix $(\partial f(\alpha;\beta)/\partial\alpha)$. So given $\alpha$, there is a unique $\alpha'$
such that

$$ f^j(\alpha;\alpha') = 0 \tag{9.15} $$


(Remember again that here we are viewing the $\eta$'s as given, have found the $f$'s
by solving Eqs.(9.12), and are establishing various properties for them!) These
$\alpha'$ are the coordinates of $a^{-1}$, where $\alpha$ are the coordinates of $a$. But while
we have determined $\alpha'$ by imposing the condition $aa^{-1} = e$, will we also have
$a^{-1}a = e$? In other words, will the $\alpha'$ obtained by solving Eq.(9.15) also obey
$f(\alpha';\alpha) = 0$? Yes, because of the associativity law (9.14) already proved! Using
the rule (9.15) twice, let us suppose we have:

$$ \begin{aligned}
a \leftrightarrow \alpha &: f^j(\alpha;\alpha') = 0 \Rightarrow a^{-1} \leftrightarrow \alpha'; \\
a^{-1} \leftrightarrow \alpha' &: f^j(\alpha';\alpha'') = 0 \Rightarrow (a^{-1})^{-1} \leftrightarrow \alpha''
\end{aligned} \tag{9.16} $$




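Then associativity gives:

$$ \begin{aligned}
& f(\alpha; f(\alpha';\alpha'')) = f(f(\alpha;\alpha'); \alpha''), \quad \text{so that } \alpha = \alpha'', \\
& \text{i.e., } a^{-1}a = a^{-1}(a^{-1})^{-1} = e.
\end{aligned} \tag{9.17} $$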


To sum up, if a Lie group $G$ and a coordinate system $\alpha$ are given, then
the $f$'s are known, the $\eta$'s and $\xi$'s can be computed, and they will of course be
acceptable. Conversely, if acceptable $\eta$'s and $\xi$'s are given, we can build up the
$f$'s uniquely, and thus reconstruct the group, at least locally.


The importance of the systems of functions $\eta$, $\xi$ must be evident. An
interpretation for the $\eta$'s can be developed thus:

Following from the definition (9.9), for small $\alpha$ we can write

$$ f^j(\alpha;\beta) = \beta^j + \eta^j_k(\beta)\,\alpha^k + O(\alpha^2) \tag{9.18} $$


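So, if $a$ is an element close to the identity $e$, with coordinates $\alpha$, and $b$ is
some (finite) element with coordinates $\beta$, then $ab$ is close to $b$ and has coordinates
which differ from $\beta$ by the amount $H(\beta)\alpha$:

$$ \begin{aligned}
& a \text{ near } e,\ b \text{ general} \Rightarrow b' = ab \text{ near } b, \\
& \beta' \simeq \beta + H(\beta)\alpha + O(\alpha^2)
\end{aligned} \tag{9.19} $$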


9.3 One-parameter Subgroups and Canonical
Coordinates


A one parameter subgroup (OPS) in a Lie group $G$ is a continuous and
differentiable family of elements $a(\sigma) \in G$, where $\sigma$ is a real parameter, forming a
subgroup:

$$ \begin{aligned}
a(\sigma)\,a(\sigma') &= a(\sigma + \sigma'), \\
a(0) &= e, \\
a(\sigma)^{-1} &= a(-\sigma)
\end{aligned} \tag{9.20} $$


(It can be shown that an OPS, defined more weakly than here, is necessarily 
abelian, and that the parameter can be chosen so that group composition cor- 
responds to parameter addition; for simplicity, all this has been made part of 
the definition of an OPS). 

We now subject this concept to the same kind of treatment as the group 
itself in the previous section. At first, given G, and an OPS in it, we develop 
a way to describe the latter. Then later we turn the situation around, and use 
the description as a way to determine an OPS (G being given). 


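Given $G$ and $a(\sigma)$, in the coordinate system $\alpha$ let $a(\sigma)$ have coordinates
$\alpha(\sigma)$:

$$ \begin{aligned}
a(\sigma) &\to \alpha^j(\sigma); \\
\alpha^j(0) &= 0; \\
f^j(\alpha(\sigma');\alpha(\sigma)) &= \alpha^j(\sigma' + \sigma)
\end{aligned} \tag{9.21} $$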


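Now take $\sigma'$ to be an infinitesimal $\delta\sigma$, and also write

$$ t^j = \frac{d\alpha^j(\sigma)}{d\sigma}\bigg|_{\sigma=0}, \qquad \alpha^j(\delta\sigma) \simeq t^j\,\delta\sigma \tag{9.22} $$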


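Then from Eq.(9.21),

$$ \begin{aligned}
\alpha^j(\sigma + \delta\sigma) &= f^j(\alpha(\delta\sigma); \alpha(\sigma)) \\
&\simeq f^j(t\,\delta\sigma; \alpha(\sigma)) \\
&\simeq \alpha^j(\sigma) + \eta^j_k(\alpha(\sigma))\,t^k\,\delta\sigma
\end{aligned} \tag{9.23} $$

so the coordinates of the elements on an OPS obey the (generally nonlinear)
ordinary differential equations and boundary conditions

$$ \frac{d\alpha^j(\sigma)}{d\sigma} = \eta^j_k(\alpha(\sigma))\,t^k, \qquad \alpha^j(0) = 0 \tag{9.24} $$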


We say $t^j$ are the components of the tangent vector to the OPS at the identity.
Can the above be reversed? Given $G$, the $f$'s, $\eta$'s and $\xi$'s, suppose we choose
an $r$-component real vector $t^j$, and solve the system (9.24) and get functions
$\alpha^j(\sigma)$. Will these be the coordinates on an OPS and will its tangent at the
identity be $t$? The answer is in the affirmative, and the proof is similar to that
in the previous section. We wish to prove that, given Eqs.(9.24) with solution
$\alpha^j(\sigma)$,

$$ L^j(\sigma) \equiv \alpha^j(\sigma + \sigma') = f^j(\alpha(\sigma); \alpha(\sigma')) \equiv R^j(\sigma) \tag{9.25} $$


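Here we regard $\sigma$ as active and $\sigma'$ as implicit in $L$ and $R$. Then Eqs.(9.24), (9.12)
give:

$$ \begin{aligned}
\frac{dL^j(\sigma)}{d\sigma} &= \eta^j_k(L(\sigma))\,t^k, \\
L^j(0) &= \alpha^j(\sigma'); \\
\frac{dR^j(\sigma)}{d\sigma} &= (H(R(\sigma))\,\Xi(\alpha(\sigma))\,H(\alpha(\sigma))\,t)^j \\
&= \eta^j_k(R(\sigma))\,t^k, \\
R^j(0) &= \alpha^j(\sigma')
\end{aligned} \tag{9.26} $$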




Since both the differential equations and boundary conditions are the same
for $L$ and $R$, they coincide and the equality (9.25) is proved.

In summary, a given OPS determines a tangent vector $t$ uniquely and vice
versa, based on the differential equations (9.24).
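The reverse construction can be tried out numerically: pick a tangent vector, integrate Eq.(9.24), and test the subgroup property (9.21). The sketch below does this for the illustrative group $x \to e^{p}x + q$ (an assumption of this example), for which $H(\alpha)$ is the identity except for the lower-left entry $\alpha^2$; this form of $H$ was worked out by hand from Eq.(9.9). The integrator is a standard fourth-order Runge-Kutta scheme, and the names `ops`, `H_times`, `shift` are hypothetical.

```python
import math

def f(alpha, beta):
    """Composition functions of the illustrative group x -> exp(p)*x + q."""
    p1, q1 = alpha
    p2, q2 = beta
    return (p1 + p2, q1 + math.exp(p1) * q2)

def H_times(alpha, t):
    """(H(alpha) t)^j with H(alpha) = [[1, 0], [alpha^2, 1]] for this group."""
    return (t[0], alpha[1] * t[0] + t[1])

def shift(u, k, s):
    return (u[0] + s * k[0], u[1] + s * k[1])

def ops(t, sigma, steps=400):
    """Integrate Eq. (9.24), d(alpha)/d(sigma) = H(alpha) t, alpha(0) = 0, with RK4."""
    a = (0.0, 0.0)
    h = sigma / steps
    for _ in range(steps):
        k1 = H_times(a, t)
        k2 = H_times(shift(a, k1, h / 2), t)
        k3 = H_times(shift(a, k2, h / 2), t)
        k4 = H_times(shift(a, k3, h), t)
        a = tuple(a[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
                  for i in range(2))
    return a

t = (0.7, -0.3)                          # arbitrary tangent vector
s1, s2 = 0.4, 0.9
lhs = ops(t, s1 + s2)                    # alpha(sigma + sigma')
rhs = f(ops(t, s1), ops(t, s2))          # f(alpha(sigma); alpha(sigma'))
ops_error = max(abs(x - y) for x, y in zip(lhs, rhs))

# Closed-form solution of the same equations, for comparison
s = s1 + s2
exact = (t[0] * s, t[1] * (math.exp(t[0] * s) - 1) / t[0])
exact_error = max(abs(x - y) for x, y in zip(lhs, exact))
```

Both errors come out at the level of the integration accuracy, which is what the uniqueness argument around Eq.(9.26) leads one to expect.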

So far we have allowed considerable flexibility in choice of coordinate
systems $\alpha$ for $G$. The properties of OPS's motivate the definition of a special family
of coordinate systems, the so called canonical coordinates (of the first kind). At
first, let us denote the solution to Eqs.(9.24) by $\alpha(\sigma;t)$, indicating explicitly the
dependence on the tangent vector $t$: there are $r + 1$ real arguments here. But a
moment's reflection shows that the dependence can only be on the $r$ products
$\sigma t^1, \sigma t^2, \ldots, \sigma t^r$, so we will write the solution to Eqs.(9.24) as:

$$ \begin{aligned}
\alpha^j(\sigma; t) &= \chi^j(\sigma t) \\
&= r \text{ functions of } r \text{ real independent arguments}; \\
\chi^j(0) &= 0, \\
\chi^j(t\,\delta\sigma) &\simeq t^j\,\delta\sigma + O(\delta\sigma)^2.
\end{aligned} \tag{9.27} $$


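In fact the differential equations (9.24) for an OPS can be expressed as a system
of equations obeyed by the functions $\chi$:

$$ \begin{aligned}
t^k\,\frac{\partial\chi^j(t)}{\partial t^k} &= \eta^j_k(\chi(t))\,t^k, \\
\chi^j(0) &= 0, \\
\frac{\partial\chi^j(t)}{\partial t^k}\bigg|_{t=0} &= \delta^j_k
\end{aligned} \tag{9.28} $$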


Thus, given $G$ and a coordinate system $\alpha$ around $e$, we have arrived at the
system of functions $\chi(t)$. Now the Jacobian property of $\chi$ above means that if
any element $a \in \mathfrak{N}$ (sufficiently small) with coordinates $\alpha^j$ is given, we can find
a unique vector $t^j$ such that

$$ \chi^j(t) = \alpha^j \tag{9.29} $$

In other words: for any $a \in \mathfrak{N}$, there is a unique tangent vector $t$ such that the
corresponding OPS passes through $a$ at parameter value $\sigma = 1$. We express this
relationship in the (at this point somewhat formal) exponential notation:

$$ \begin{aligned}
& a \text{ with coordinates } \alpha^j:\ a = \exp(t), \\
& \alpha^j = \alpha^j(\sigma; t)\big|_{\sigma=1} = \chi^j(t).
\end{aligned} \tag{9.30} $$


Thus in this notation the elements $a(\sigma)$ on an OPS with tangent vector $t$ are
$\exp(\sigma t)$, and the OPS property means

$$ \exp(\sigma t)\exp(\sigma' t) = \exp((\sigma + \sigma')t). \tag{9.31} $$

All $a \in \mathfrak{N}$ are obtained by suitably varying $t$ in $\exp(t)$. In particular,

$$ e = \exp(0), \qquad (\exp(t))^{-1} = \exp(-t) \tag{9.32} $$




Can we now exploit these results to pass from the initial arbitrarily chosen
coordinates $\alpha^j$ to a new system of coordinates in which the $t^j$ are not only the
components of the tangent vector to the OPS passing through a given $a \in \mathfrak{N}$ at
$\sigma = 1$, but also serve as the coordinates of $a$? Yes, because once the functions
$\chi(t)$ are in hand, we can in principle turn the equations

$$ \alpha^j = \chi^j(t) \tag{9.33} $$

“inside out” (the Jacobian property of $\chi$ in Eq.(9.28) ensures this), and express
$t$ in terms of $\alpha$:

$$ t^j = t^j(\alpha) = \text{new coordinates } \bar{\alpha}^j \tag{9.34} $$


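The distinguishing characteristics of these new coordinates $\bar{\alpha}$ for $\mathfrak{N} \subset G$ are
that on an OPS $a(\sigma)$:

$$ a(\sigma) \to \text{old coordinates } \alpha^j(\sigma) = \chi^j(\sigma t) \to \text{new coordinates } \bar{\alpha}^j(\sigma) = \sigma t^j \tag{9.35} $$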


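Such coordinates are called canonical coordinates. If we were to write $\bar{f}, \bar{\eta}, \bar{\xi}$
for the functions associated with them, the characterisation (9.35) means that
now the solution to Eqs.(9.24) in the new coordinates is very simple:

$$ \begin{aligned}
& \frac{d\bar{\alpha}^j(\sigma)}{d\sigma} = \bar{\eta}^j_k(\bar{\alpha}(\sigma))\,t^k, \quad \bar{\alpha}^j(0) = 0 \Rightarrow \\
& \bar{\alpha}^j(\sigma) = \sigma t^j \Rightarrow \\
& \text{i.e., } \bar{H}(\alpha)\,\alpha = \alpha.
\end{aligned} \tag{9.36} $$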


We thus have two direct ways of defining a canonical coordinate system:
either (i) the functions $\bar{\eta}$ must have the functional property appearing in Eq.(9.36)
or (ii) the elements on an OPS have coordinates $\sigma t$.

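The passage from a general coordinate system $\alpha$ to the canonical system
$\bar{\alpha}$ uniquely associated with it has the feature that up to linear terms they agree,
and (possibly) diverge only thereafter:

$$ \begin{aligned}
\bar{\alpha}^j &\simeq \alpha^j + O(\alpha^2), \\
\alpha^j &\simeq \bar{\alpha}^j + O(\bar{\alpha}^2)
\end{aligned} \tag{9.37} $$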


(In particular, if $\alpha$ were already canonical, then $\bar{\alpha} = \alpha$!) We can say that while
there are “infinitely many” general coordinate systems for $\mathfrak{N} \subset G$ related to
one another by functional transformations, there is a many-to-one passage from
general to canonical coordinate systems; these are “far fewer” but still infinite
in number, and related to one another by linear transformations. Of course the
set of canonical coordinate systems appears already as a subset of the set of all
general coordinate systems; and all these relationships hinge on the concept and
properties of OPS's.
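A worked example may make the passage to canonical coordinates concrete. The block below assumes, purely for illustration, the two-parameter group of maps $x \to e^{\alpha^1}x + \alpha^2$ with composition functions $f(\alpha;\beta) = (\alpha^1 + \beta^1,\ \alpha^2 + e^{\alpha^1}\beta^2)$; this group is not taken from the text. It lists $\eta$ from Eq.(9.9), the solution of Eq.(9.24), and the resulting canonical coordinates.

```latex
% Illustrative group (an assumption of this example): x -> e^{\alpha^1} x + \alpha^2
\begin{aligned}
H(\alpha) &= \begin{pmatrix} 1 & 0 \\ \alpha^2 & 1 \end{pmatrix}, \\
\frac{d\alpha^1}{d\sigma} &= t^1, \quad
\frac{d\alpha^2}{d\sigma} = \alpha^2 t^1 + t^2
\;\Rightarrow\;
\chi^1(t) = t^1, \quad \chi^2(t) = t^2\,\frac{e^{t^1} - 1}{t^1}, \\
\bar{\alpha}^1 &= \alpha^1, \quad
\bar{\alpha}^2 = \frac{\alpha^1 \alpha^2}{e^{\alpha^1} - 1}
= \alpha^2 - \tfrac{1}{2}\,\alpha^1 \alpha^2 + O(\alpha^3).
\end{aligned}
```

The solution indeed depends on $\sigma$ and $t$ only through the products $\sigma t^j$, as stated around Eq.(9.27), and the old and new coordinates agree up to linear terms, as in Eq.(9.37).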




9.4 Integrability Conditions and Structure 
Constants 


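Now we consider the conditions on the $\eta$'s and $\xi$'s which ensure that Eqs.(9.12)
for the $f$'s can be solved. Regarding the $\beta$'s as fixed parameters, these integrability
conditions are:

$$ \begin{aligned}
& \frac{\partial}{\partial\alpha^m}\frac{\partial f^j}{\partial\alpha^k} = \frac{\partial}{\partial\alpha^k}\frac{\partial f^j}{\partial\alpha^m}, \quad \text{i.e.,} \\
& \frac{\partial}{\partial\alpha^m}\left(\eta^j_l(f)\,\xi^l_k(\alpha)\right) = \frac{\partial}{\partial\alpha^k}\left(\eta^j_l(f)\,\xi^l_m(\alpha)\right)
\end{aligned} \tag{9.38} $$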


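We can develop both sides using Eqs.(9.12) again, and then rearrange terms so
that all expressions evaluated at $f$ are on one side, those evaluated at $\alpha$ on the
other. Partial derivatives are indicated by indices after a comma, and the result
is:

$$ \begin{aligned}
& \xi^l_m(f)\left[\eta^n_j(f)\,\eta^m_{k,n}(f) - \eta^n_k(f)\,\eta^m_{j,n}(f)\right] \\
& \qquad = \eta^m_j(\alpha)\,\eta^n_k(\alpha)\left[\xi^l_{m,n}(\alpha) - \xi^l_{n,m}(\alpha)\right]
\end{aligned} \tag{9.39} $$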


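Since the two sides are evaluated at different points in $\mathfrak{N}$ (remember the $\beta$'s!),
each side must be a constant indexed by $j$, $k$ and $l$. Let us write $-c^l_{jk}$ for these
constants: then the integrability conditions for Eqs.(9.12) are that there must
be these constants and

$$ \begin{aligned}
\xi^l_m(\alpha)\left[\eta^n_j(\alpha)\,\eta^m_{k,n}(\alpha) - \eta^n_k(\alpha)\,\eta^m_{j,n}(\alpha)\right] &= -c^l_{jk} \qquad (a) \\
\eta^m_j(\alpha)\,\eta^n_k(\alpha)\left[\xi^l_{m,n}(\alpha) - \xi^l_{n,m}(\alpha)\right] &= -c^l_{jk} \qquad (b)
\end{aligned} \tag{9.40} $$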


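These two conditions are identical in content, on account of $(\eta)^{-1}$ being
$(\xi)$. It is convenient to express them in the forms

$$ \begin{aligned}
\eta^m_j(\alpha)\,\eta^l_{k,m}(\alpha) - \eta^m_k(\alpha)\,\eta^l_{j,m}(\alpha) &= -c^m_{jk}\,\eta^l_m(\alpha) \qquad (a) \\
\xi^j_{m,n}(\alpha) - \xi^j_{n,m}(\alpha) + c^j_{kl}\,\xi^k_m(\alpha)\,\xi^l_n(\alpha) &= 0 \qquad (b)
\end{aligned} \tag{9.41} $$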


An “acceptable” set of functions $\eta(\alpha)$ must obey these partial
differential equations for some numerical constants $c^l_{jk}$; if they do, then we can solve
Eqs.(9.12) to get $f^j(\alpha;\beta)$, and these will determine the group structure (at least
locally).

The $c^l_{jk}$ are called “structure constants” for the group $G$. What are
acceptable sets of structure constants? They cannot be chosen freely. If the group $G$
were given, the corresponding $c^l_{jk}$ can be calculated, and will be found (as we
shall see) to obey certain algebraic conditions. In Section 9.6, we will see that
all sets of constants obeying these algebraic conditions are acceptable.

We outline in sequence the properties of the $r^3$ constants $c^l_{jk}$, which by
definition are real numbers, assuming that $G$ is given:




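(i) Antisymmetry in the subscripts, obvious from Eqs.(9.40):

$$ c^l_{jk} = -c^l_{kj} \tag{9.42} $$

(ii) Taking $\alpha = 0$ in Eq.(9.41a), we have the explicit expression

$$ \begin{aligned}
c^l_{jk} &= \eta^l_{j,k}(0) - \eta^l_{k,j}(0) \\
&= \left(\frac{\partial^2}{\partial\beta^j\,\partial\alpha^k} - \frac{\partial^2}{\partial\beta^k\,\partial\alpha^j}\right) f^l(\beta;\alpha)\bigg|_{\alpha=\beta=0}
\end{aligned} \tag{9.43} $$

Therefore if we expand $f(\beta;\alpha)$ in a Taylor series as

$$ f^l(\beta;\alpha) = \beta^l + \alpha^l + a^l_{jk}\,\beta^j\alpha^k + \cdots, \tag{9.44} $$

then,

$$ c^l_{jk} = a^l_{jk} - a^l_{kj} \tag{9.45} $$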
(iii) Relation to commutators of group elements: take two elements $a, b \in
\mathfrak{N}$ both close to $e$, and calculate the coordinates of the commutator $q(a, b)$
(Eq.(7.16)): retaining leading terms only, we find:

$$ c = q(a, b):\quad \gamma^l = c^l_{jk}\,\alpha^j\beta^k + \cdots \tag{9.46} $$

This shows that if $G$ is abelian, all the structure constants vanish; the converse
is also true, as we will find later.

(iv) Jacobi condition: we exploit the integrability condition (9.41a) on $\eta$
by multiplying it by two tangent vectors $t^j u^k$ and writing terms in a suggestive
form:

$$ t^j\eta^m_j(\alpha)\,\frac{\partial}{\partial\alpha^m}\,u^k\eta^l_k(\alpha) - u^k\eta^m_k(\alpha)\,\frac{\partial}{\partial\alpha^m}\,t^j\eta^l_j(\alpha) = -c^m_{jk}\,t^j u^k\,\eta^l_m(\alpha) \tag{9.47} $$

From experience in handling linear first order differential operators in quantum
mechanics, we see that we can express this as a commutation relation:

$$ \left[t^j\eta^m_j(\alpha)\,\frac{\partial}{\partial\alpha^m},\ u^k\eta^l_k(\alpha)\,\frac{\partial}{\partial\alpha^l}\right] = -v^l\,\eta^m_l(\alpha)\,\frac{\partial}{\partial\alpha^m}, \qquad v^l = c^l_{jk}\,t^j u^k \tag{9.48} $$

That is, if for each tangent vector $t$ we define the linear operator

$$ N[t] = t^j\,\eta^m_j(\alpha)\,\frac{\partial}{\partial\alpha^m} \tag{9.49} $$

then the integrability conditions on $\eta$ are, in agreement with Eq.(9.48),

$$ \left[N[t], N[u]\right] = N[t]\,N[u] - N[u]\,N[t] = -N[v] \tag{9.50} $$

where $v$ is another tangent vector formed out of $t$ and $u$ using the structure
constants:

$$ v = [t, u] = \text{Lie bracket of } t \text{ with } u, \qquad v^l = c^l_{jk}\,t^j u^k \tag{9.51} $$

We have introduced here the notion of the Lie bracket, an algebraic operation
determining a tangent vector $[t, u]$ once $t$ and $u$ are given, at this moment to be
distinguished from the commutator bracket occurring in Eq.(9.50). But, since
commutators among operators do obey the Jacobi identity, Eqs.(9.50), (9.51)
lead to the result that Lie brackets must also obey it! For any three vectors
$t$, $u$ and $w$, we have

$$ \begin{aligned}
& \sum_{\text{cyclic}} \left[\left[N[t], N[u]\right], N[w]\right] = 0 \Rightarrow \\
& \sum_{\text{cyclic}} \left[N[[t, u]], N[w]\right] = 0 \Rightarrow \\
& \sum_{\text{cyclic}} N[[[t, u], w]] = 0 \Rightarrow \\
& [[t, u], w] + [[u, w], t] + [[w, t], u] = 0
\end{aligned} \tag{9.52} $$

Here we have used the fact that $N[t]$ depends linearly on $t$. Directly in terms of
the structure constants, the Jacobi condition is

$$ c^l_{jk}\,c^n_{lm} + c^l_{km}\,c^n_{lj} + c^l_{mj}\,c^n_{lk} = 0, \quad \text{all } j, k, m, n \tag{9.53} $$


The complete set of conditions on structure constants which makes them ac- 
ceptable are: reality, antisymmetry, Eq.(9.42); and the Jacobi identities (9.53). 
There are no more conditions. All these properties of structure constants, re- 
flected also in the way the tangent vector v is formed out of ¢ and u, motivate 
us to define the concept of the Lie Algebra. 
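The conditions just listed are easy to test mechanically. The sketch below assumes the illustrative group $x \to e^{p}x + q$ (not from the text), extracts its structure constants from the second derivatives in Eq.(9.43) by a finite-difference stencil, and then checks antisymmetry (9.42) and the Jacobi condition (9.53). For contrast it also builds a hypothetical antisymmetric set `bad` that violates (9.53). The storage convention `c[l][j][k]` for $c^l_{jk}$ (indices starting at 0) and all function names are choices of this sketch.

```python
import math

def f(first, second):
    """Composition functions of the illustrative group x -> exp(p)*x + q."""
    p1, q1 = first
    p2, q2 = second
    return (p1 + p2, q1 + math.exp(p1) * q2)

def mixed(l, j, k, h=1e-3):
    """d^2 f^l(beta; alpha) / d beta^j d alpha^k at alpha = beta = 0 (4-point stencil)."""
    total = 0.0
    for sb in (1, -1):
        for sa in (1, -1):
            beta, alpha = [0.0, 0.0], [0.0, 0.0]
            beta[j] = sb * h
            alpha[k] = sa * h
            total += sb * sa * f(beta, alpha)[l]
    return total / (4 * h * h)

def structure_constants(r=2):
    """c[l][j][k] from Eq. (9.43); rounded because they are integers in this example."""
    return [[[round(mixed(l, j, k) - mixed(l, k, j)) for k in range(r)]
             for j in range(r)] for l in range(r)]

def antisymmetric(c):
    """Eq. (9.42)."""
    r = len(c)
    return all(c[l][j][k] == -c[l][k][j]
               for l in range(r) for j in range(r) for k in range(r))

def jacobi(c):
    """Eq. (9.53), checked for all j, k, m, n."""
    r = len(c)
    return all(
        sum(c[l][j][k] * c[n][l][m] + c[l][k][m] * c[n][l][j] + c[l][m][j] * c[n][l][k]
            for l in range(r)) == 0
        for j in range(r) for k in range(r) for m in range(r) for n in range(r))

c = structure_constants()

# Hypothetical antisymmetric constants that fail Eq. (9.53):
# [e1, e2] = e2, [e1, e3] = e3, [e2, e3] = e1
bad = [[[0] * 3 for _ in range(3)] for _ in range(3)]
for (j, k), l in {(0, 1): 1, (0, 2): 2, (1, 2): 0}.items():
    bad[l][j][k], bad[l][k][j] = 1, -1
```

For this group the only nonvanishing constants are $c^2_{12} = -c^2_{21} = 1$, i.e. `c[1][0][1]` and `c[1][1][0]`. The set `bad` shows that antisymmetry alone is not sufficient.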


9.5 Definition of a (real) Lie Algebra: Lie 
Algebra of a given Lie Group 
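A (real) Lie algebra is a real linear vector space, $\mathcal{L}$ say, with elements $u, v, t, w, \ldots$
among which a Lie bracket is defined. This is a rule which associates with any
two given elements of $\mathcal{L}$ a third element, subject to certain conditions:

$$ \begin{aligned}
&\text{(i)} \quad u, v \in \mathcal{L} \Rightarrow [u, v] \in \mathcal{L}; \\
&\text{(ii)} \quad [u, v] = -[v, u]; \\
&\text{(iii)} \quad [u, v] \text{ linear in both } u \text{ and } v; \\
&\text{(iv)} \quad u, v, w \in \mathcal{L} \Rightarrow [[u, v], w] + [[v, w], u] + [[w, u], v] = 0
\end{aligned} \tag{9.54} $$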




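If a basis $\{e_j\}$, $j = 1, 2, \ldots, r$ is chosen for $\mathcal{L}$, the dimension of $\mathcal{L}$ being $r$,
then because of the linearity property (9.54)(iii),

$$ \begin{aligned}
u = u^j e_j,\ v = v^j e_j \Rightarrow [u, v] &= u^j v^k\,[e_j, e_k] \\
&= u^j v^k\,c^l_{jk}\,e_l, \\
[e_j, e_k] &= c^l_{jk}\,e_l, \\
\text{i.e., } [u, v]^l &= c^l_{jk}\,u^j v^k
\end{aligned} \tag{9.55} $$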


For these structure constants $c^l_{jk}$ of $\mathcal{L}$, properties (9.54)(ii), (iv) then
immediately give back the properties (9.42), (9.53) we had found for the structure
constants of a Lie group!

We thus see that the definition of a Lie algebra is designed to capture
in intrinsic form all those properties that are known to be possessed by the
structure constants of any Lie group, in particular the process by which in
Eqs.(9.50), (9.51), any two tangent vectors determine a third. We can now say:
given a Lie group $G$ of dimension $r$, the tangent vectors $u, v, w, t, \ldots$ at $e$ to
all possible OPS's in $G$ taken together form an $r$-dimensional Lie algebra
determined by $G$. We call it the Lie algebra of $G$. It is sometimes denoted by
$\underline{G}$, sometimes by $g$. Changing the coordinate system $\alpha$ over $\mathfrak{N} \subset G$ for the
calculation of the structure constants $c^l_{jk}$ results in a linear transformation, or
rather a change of basis, in $\underline{G}$.

It is the group structure for elements in $G$ infinitesimally close to $e$ that
determines the bracket structure in $\underline{G}$. Therefore if we have two (or more)
Lie groups $G, G', \ldots$ which are locally isomorphic to one another in suitable
neighbourhoods of their respective identity elements, they will all lead to, and
so they will all share, the same Lie algebra $\mathcal{L}$. One can thus at best expect
that a given Lie algebra can determine only the local structure of any of the Lie
groups that could have led to it. We discuss this reconstruction process in the
next section.


9.6 Local Reconstruction of Lie Group from 
Lie Algebra 


Suppose a Lie algebra $\mathcal{L}$, that is, an acceptable set of structure constants $c^l_{jk}$, is
given. Can we then say that we can compute the functions $\eta^j_k(\alpha)$? If the answer
were in the affirmative, then the analysis in Section 9.2 tells us that we can go
on to compute $f(\alpha;\beta)$ too, so from the structure constants we would have built
up a Lie group $G$ at least locally.

But actually we know in advance that the $\eta^j_k(\alpha)$ cannot be found if only the
$c^l_{jk}$ are given! Suppose you already had a Lie group $G$ in hand; clearly there
is infinite freedom to change the coordinate system on $\mathfrak{N} \subset G$ leaving the
coordinates around $e$ unchanged to first order only. Such changes of coordinates




over $\mathfrak{N}$ would definitely change the $\eta^j_k(\alpha)$ but not the $c^l_{jk}$. So unless we restrict
the coordinate system in some way, we cannot hope to get $\eta^j_k(\alpha)$ starting from
$c^l_{jk}$.

The key is to use canonical coordinates of the first kind. It is possible, given
an acceptable set of structure constants $c^l_{jk}$, to solve uniquely the combined
system of partial differential equations and algebraic equations,

$$ \begin{aligned}
\xi^j_{m,n}(\alpha) - \xi^j_{n,m}(\alpha) + c^j_{kl}\,\xi^k_m(\alpha)\,\xi^l_n(\alpha) &= 0, \\
\xi^j_k(\alpha)\,\alpha^k &= \alpha^j, \\
\xi^j_k(0) &= \delta^j_k
\end{aligned} \tag{9.56} $$


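While the details can be found in various references, here we simply describe
the results. We define a real $r \times r$ matrix $C(\alpha)$ by

$$ C^j_k(\alpha) = c^j_{lk}\,\alpha^l \tag{9.57} $$

the superscript (subscript) being the row (column) index. Then the matrix $\Xi(\alpha)$
is given as an infinite power series, in fact an entire function, in $C(\alpha)$:

$$ \begin{aligned}
\Xi(\alpha) = (\xi^j_k(\alpha)) &= \left(\frac{e^{C(\alpha)} - 1}{C(\alpha)}\right) \\
&= \sum_{n=0}^{\infty} \frac{(C(\alpha))^n}{(n+1)!}
\end{aligned} \tag{9.58} $$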
n=1 


Because of the antisymmetry property of the structure constants, the condition
reflecting the use of a canonical coordinate system is satisfied: $C(\alpha)\alpha = 0$,
so that $\Xi(\alpha)\alpha = \alpha$. It is in the
course of developing this solution (9.58) that one sees that no more conditions
need to be imposed on structure constants beyond antisymmetry and the Jacobi
identities. The matrix $H(\alpha)$ inverse to $\Xi(\alpha)$ is again a power series which (in
general) has a finite radius of convergence:

$$ \begin{aligned}
H(\alpha) &= (\eta^j_k(\alpha)) \\
&= \left(\frac{C(\alpha)}{e^{C(\alpha)} - 1}\right) \\
&= 1 - \frac{1}{2}\,C(\alpha) + \frac{1}{12}\,C(\alpha)^2 - \cdots
\end{aligned} \tag{9.59} $$


In all these expressions, the basic ingredient is the matrix $C(\alpha)$. Its intrinsic
property is that acting on a vector $\beta$, it produces the Lie bracket of $\alpha$ with $\beta$:

$$ C(\alpha)\beta = [\alpha, \beta] \tag{9.60} $$
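The series (9.58) and (9.59) can be evaluated directly once the structure constants are chosen. The sketch below assumes, for illustration only, the structure constants $c^l_{jk} = \epsilon_{jkl}$ of the rotation algebra, for which the Lie bracket is the vector cross product. It builds $C(\alpha)$ from Eq.(9.57), sums the series for $\Xi(\alpha)$, uses the first few terms of the expansion of $x/(e^x - 1)$ for $H(\alpha)$, and checks Eq.(9.60), the canonical condition $\Xi(\alpha)\alpha = \alpha$, and $H\Xi \approx 1$ for a small $\alpha$. The names and the truncation orders are choices of this sketch.

```python
from math import factorial

def eps(j, k, l):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (j - k) * (k - l) * (l - j) // 2

def C(alpha):
    """Eq. (9.57): C^j_k = c^j_{lk} alpha^l, with c^j_{lk} = eps(l, k, j)."""
    return [[sum(eps(l, k, j) * alpha[l] for l in range(3)) for k in range(3)]
            for j in range(3)]

def matmul(A, B):
    return [[sum(A[i][m] * B[m][j] for m in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(A, x):
    return [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]

def power_series(M, coeffs):
    """sum_n coeffs[n] * M^n for a 3x3 matrix M."""
    total = [[0.0] * 3 for _ in range(3)]
    P = [[float(i == j) for j in range(3)] for i in range(3)]
    for cn in coeffs:
        total = [[total[i][j] + cn * P[i][j] for j in range(3)] for i in range(3)]
        P = matmul(P, M)
    return total

def Xi(alpha, terms=20):
    """Eq. (9.58): sum over n >= 0 of C^n / (n + 1)!, truncated."""
    return power_series(C(alpha), [1.0 / factorial(n + 1) for n in range(terms)])

def H(alpha):
    """Leading terms of x/(e^x - 1) = 1 - x/2 + x^2/12 - x^4/720 + ..."""
    return power_series(C(alpha), [1.0, -0.5, 1.0 / 12, 0.0, -1.0 / 720])

alpha, beta = (0.1, -0.2, 0.05), (0.3, 0.4, -0.5)
cross = (alpha[1] * beta[2] - alpha[2] * beta[1],
         alpha[2] * beta[0] - alpha[0] * beta[2],
         alpha[0] * beta[1] - alpha[1] * beta[0])
bracket_error = max(abs(x - y) for x, y in zip(matvec(C(alpha), beta), cross))   # Eq. (9.60)
canonical_error = max(abs(x - y) for x, y in zip(matvec(Xi(alpha), alpha), alpha))
product = matmul(H(alpha), Xi(alpha))
inverse_error = max(abs(product[i][j] - (i == j)) for i in range(3) for j in range(3))
```

The truncated series for $H$ is only adequate for small $\alpha$, in line with the remark that (9.59) has in general a finite radius of convergence.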


Now that an acceptable set of functions $\eta^j_k(\alpha)$ has been constructed in
canonical coordinates, it is a routine matter to solve the partial differential
equations (9.12), and obtain the group composition functions $f(\alpha;\beta)$, also in
canonical coordinates. The result is the so-called Baker-Campbell-Hausdorff




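series. Remembering that we are using canonical coordinates, so that $\alpha^j, \beta^j, \ldots$
are both coordinates for group elements and components of tangent vectors, one
finds

$$ f(\alpha;\beta) = \alpha + \beta + \frac{1}{2}[\alpha, \beta] + \frac{1}{12}[\alpha - \beta, [\alpha, \beta]] + \cdots \tag{9.61} $$

The intrinsic, coordinate free form for this composition law for $G$ reads:

$$ \begin{aligned}
& u, v, \ldots \in \mathcal{L} \to \exp(u), \exp(v), \ldots \in G: \\
& \exp(u)\exp(v) = \exp\left(u + v + \frac{1}{2}[u, v] + \frac{1}{12}[u - v, [u, v]] + \cdots\right)
\end{aligned} \tag{9.62} $$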


In summary, we saw in Sections 9.4 and 9.5 that a given Lie group $G$
with coordinates $\alpha$ over a neighbourhood $\mathfrak{N}$ allows calculation of its structure
constants, and thus the setting up of its Lie algebra $\underline{G}$. Conversely in this section
we have indicated how, given the structure constants and working in a canonical
coordinate system, we can build up $\xi(\alpha)$, $\eta(\alpha)$, $f(\alpha;\beta)$ for small enough $\alpha$, $\beta$, and
thus locally reconstruct the group.

We find in both forms (9.61),(9.62) of the group composition law that after 
the two leading terms, all higher order terms involve (repeated) Lie brackets. 
This immediately establishes the converse to a statement we made in section 
9.4 (at Eq.(9.46)): if the structure constants vanish, the group is Abelian. 
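The composition law (9.62) can be checked exactly in a case where the series terminates. The sketch below assumes a matrix realisation, with the Lie bracket taken as the matrix commutator and $\exp$ as the matrix exponential, and uses strictly upper triangular $3 \times 3$ matrices (the Heisenberg algebra). For these matrices all double brackets vanish, so only the first three terms of the series survive. The example and all names in it are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction

def mat(a, b, c):
    """Strictly upper triangular 3x3 matrix with entries a, b (off-diagonal) and c (corner)."""
    z = Fraction(0)
    return [[z, Fraction(a), Fraction(c)], [z, z, Fraction(b)], [z, z, z]]

def matmul(A, B):
    return [[sum(A[i][m] * B[m][j] for m in range(3)) for j in range(3)]
            for i in range(3)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(3)] for i in range(3)]

def scale(s, A):
    return [[s * A[i][j] for j in range(3)] for i in range(3)]

def commutator(A, B):
    return add(matmul(A, B), scale(-1, matmul(B, A)))

def exp_nilpotent(N):
    """exp(N) = 1 + N + N^2/2, exact here because N^3 = 0."""
    one = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
    return add(add(one, N), scale(Fraction(1, 2), matmul(N, N)))

u = mat(1, 2, 3)
v = mat(-4, 5, Fraction(1, 2))

lhs = matmul(exp_nilpotent(u), exp_nilpotent(v))             # exp(u) exp(v)
w = add(add(u, v), scale(Fraction(1, 2), commutator(u, v)))  # u + v + [u, v]/2
rhs = exp_nilpotent(w)

double_bracket = commutator(u, commutator(u, v))             # vanishes for these matrices
```

Exact rational arithmetic is used so that the two sides can be compared with `==`. If $u$ and $v$ commute, `w` reduces to `u + v`, as expected for an abelian group.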


9.7 Comments on the $G$ - $\underline{G}$ Relationship


It was already mentioned in Section 9.5 that the Lie algebra $\underline{G}$ associated with
a Lie group $G$ is determined by the structure of $G$ “in the small”, i.e., near the
identity element. Therefore two (or more) locally isomorphic groups $G, G', \ldots$,
which all “look the same” near their identities, but possibly differ globally, will
share, or possess, the same Lie algebra: $\underline{G} = \underline{G}' = \ldots$. The global properties of
$G$ do not show up in $\underline{G}$ at all.

Examples familiar from nonrelativistic and relativistic quantum mechanics
are: SU(2) and SO(3); SL(2,C) and SO(3,1). If then some Lie algebra $\mathcal{L}$ is given,
which of the many possible globally defined Lie groups is singled out uniquely
as the result of the reconstruction process? Of all
the Lie groups $G, G', \ldots$ possessing the same Lie algebra $\mathcal{L}$, only one is simply
connected, and it is called the universal covering group of all the others. For
instance, SU(2) (SL(2,C)) is the universal covering group of SO(3) (SO(3,1)).
Denoting this topologically distinguished group by $\bar{G}$, we can say that $\mathcal{L}$ leads
unambiguously to $\bar{G}$. In other words, structure near $e$ plus simple connectivity
fixes $\bar{G}$ completely.
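The pair SU(2), SO(3) can be explored numerically. The sketch below assumes the parametrisation of a rotation by a unit four-vector $(a_0, \mathbf{a})$ with $R_{jk} = (a_0^2 - \mathbf{a}^2)\delta_{jk} + 2a_ja_k - 2a_0\epsilon_{jkl}a_l$, the form that appears in the exercises to this chapter, and treats $(a_0, \mathbf{a})$ as a unit quaternion standing for an SU(2) element; that identification and the function names are assumptions of this sketch. It checks that $(a_0, \mathbf{a})$ and $(-a_0, -\mathbf{a})$ give the same rotation, and that products map to products.

```python
import math

def eps(j, k, l):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (j - k) * (k - l) * (l - j) // 2

def rotation(q):
    """R_jk = (a0^2 - a.a) delta_jk + 2 a_j a_k - 2 a0 eps_jkl a_l."""
    a0, a = q
    aa = sum(x * x for x in a)
    return [[(a0 * a0 - aa) * (j == k) + 2 * a[j] * a[k]
             - 2 * a0 * sum(eps(j, k, l) * a[l] for l in range(3))
             for k in range(3)] for j in range(3)]

def qmul(p, q):
    """Product of two unit quaternions (a0, a)."""
    p0, pv = p
    q0, qv = q
    dot = sum(x * y for x, y in zip(pv, qv))
    cross = (pv[1] * qv[2] - pv[2] * qv[1],
             pv[2] * qv[0] - pv[0] * qv[2],
             pv[0] * qv[1] - pv[1] * qv[0])
    return (p0 * q0 - dot, tuple(p0 * qv[i] + q0 * pv[i] + cross[i] for i in range(3)))

def unit(angle, n):
    """(a0, a) = (cos(angle/2), n sin(angle/2)) for an axis n, normalised here."""
    norm = math.sqrt(sum(x * x for x in n))
    s = math.sin(angle / 2)
    return (math.cos(angle / 2), tuple(s * x / norm for x in n))

def matmul(A, B):
    return [[sum(A[i][m] * B[m][j] for m in range(3)) for j in range(3)]
            for i in range(3)]

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

q1 = unit(0.8, (1.0, 2.0, -1.0))
q2 = unit(2.1, (0.0, 1.0, 1.0))
minus_q1 = (-q1[0], tuple(-x for x in q1[1]))

two_to_one_error = max_diff(rotation(q1), rotation(minus_q1))
homomorphism_error = max_diff(rotation(qmul(q1, q2)), matmul(rotation(q1), rotation(q2)))
R = rotation(q1)
Rt = [[R[j][i] for j in range(3)] for i in range(3)]
orthogonality_error = max_diff(matmul(R, Rt), [[float(i == j) for j in range(3)] for i in range(3)])
```

Two distinct parameter points giving one rotation is the global difference between the two groups; near the identity the map is one-to-one, which is why the Lie algebras coincide.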

All of the general group theoretical notions, when specialised to Lie groups,
in turn lead to corresponding notions for Lie algebras. While we look at many
of them in the next section, we dispose of the simplest one now. This is the
question: When are two Lie algebras $\mathcal{L}$, $\mathcal{L}'$ to be regarded as “the same”? How
do we define isomorphism of Lie algebras? This requires the existence of a
one-to-one, onto (hence invertible) linear map $\varphi: \mathcal{L} \to \mathcal{L}'$ (so as real vector spaces,



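$\mathcal{L}$ and $\mathcal{L}'$ must have the same dimension) compatible with the two Lie bracket
operations:

$$ \begin{aligned}
& x, y \in \mathcal{L}:\ \varphi(x), \varphi(y) \in \mathcal{L}'; \\
& \varphi([x, y]_{\mathcal{L}}) = [\varphi(x), \varphi(y)]_{\mathcal{L}'}
\end{aligned} \tag{9.63} $$

We can then say:

$$ G \text{ and } G' \text{ locally isomorphic} \iff \underline{G} \text{ and } \underline{G}' \text{ isomorphic} \tag{9.64} $$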


In a similar way, homomorphisms and automorphisms for Lie algebras can 
easily be defined — we leave these as exercises for the reader. 


9.8 Various Kinds of and Operations with Lie 
Algebras 


We now briefly go over the notions of subalgebras, invariant, solvable, etc as 
applied to Lie algebras. 


(i) A Lie algebra $\mathcal{L}$ is abelian or commutative if all Lie brackets vanish:

$$ [x, y] = 0 \text{ for all } x, y \in \mathcal{L} \tag{9.65} $$


(ii) A subset $\mathcal{L}' \subset \mathcal{L}$ is a subalgebra if, firstly, it is a linear subspace of $\mathcal{L}$
viewed as a real linear vector space, and secondly, it is a Lie algebra in its own
right:

$$ x, y \in \mathcal{L}',\ a, b \in \mathbb{R} \Rightarrow ax + by \in \mathcal{L}',\ [x, y] \in \mathcal{L}' \tag{9.66} $$

It is a proper subalgebra, if it is neither $\mathcal{L}$ nor the zero vector alone.

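(iii) A subalgebra $\mathcal{L}' \subset \mathcal{L}$ is an invariant or a normal subalgebra (also called
an ideal), if

$$ x \in \mathcal{L}',\ y \in \mathcal{L} \Rightarrow [x, y] \in \mathcal{L}' \tag{9.67} $$

This can also be suggestively expressed as

$$ [\mathcal{L}', \mathcal{L}] \subseteq \mathcal{L}' \tag{9.68} $$

In such a case, if we project $\mathcal{L}$ as a vector space with respect to $\mathcal{L}'$, and go
to the quotient $\mathcal{L}/\mathcal{L}'$ which is a vector space, it is also a Lie algebra. It is called
the factor algebra. (Verify this!)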

(iv) Given a Lie algebra $\mathcal{L}$, its commutator subalgebra $\mathcal{L}_1$ consists of all
(real) linear combinations of Lie brackets $[x, y]$ for all choices of $x, y$ in $\mathcal{L}$. It is
also called the derived algebra of $\mathcal{L}$, and can be suggestively written as

$$ \mathcal{L}_1 = [\mathcal{L}, \mathcal{L}] \tag{9.69} $$

You can easily verify that $\mathcal{L}_1$ is an invariant subalgebra of $\mathcal{L}$, and also that
the factor algebra $\mathcal{L}/\mathcal{L}_1$ is abelian.




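(v) Given the Lie algebra $\mathcal{L}$ of dimension $r$, form its commutator algebra
$\mathcal{L}_1 = [\mathcal{L}, \mathcal{L}]$, then $\mathcal{L}_2 = [\mathcal{L}_1, \mathcal{L}_1]$ = commutator algebra of $\mathcal{L}_1$, $\mathcal{L}_3 = [\mathcal{L}_2, \mathcal{L}_2]$, and
so on, $\mathcal{L}_{j+1} = [\mathcal{L}_j, \mathcal{L}_j]$ in general. In the series of Lie algebras

$$ \mathcal{L}_0 \equiv \mathcal{L},\ \mathcal{L}_1 = [\mathcal{L}_0, \mathcal{L}_0],\ \mathcal{L}_2 = [\mathcal{L}_1, \mathcal{L}_1],\ \ldots,\ \mathcal{L}_{j+1} = [\mathcal{L}_j, \mathcal{L}_j],\ \ldots \tag{9.70} $$

at each stage $\mathcal{L}_{j+1}$ is an invariant subalgebra of $\mathcal{L}_j$, and $\mathcal{L}_j/\mathcal{L}_{j+1}$ is abelian.
If for a particular $j$, $\mathcal{L}_{j+1}$ is a proper subalgebra of $\mathcal{L}_j$, then the dimension of
$\mathcal{L}_{j+1}$ must be at least one less than that of $\mathcal{L}_j$. On the other hand, if for some
(least) $j$, we find $\mathcal{L}_{j+1} = \mathcal{L}_j$, then of course $\mathcal{L}_{j+1} = \mathcal{L}_{j+2} = \mathcal{L}_{j+3} = \ldots$ as well,
so the series stabilises at this point. Therefore, in the series (9.70) either (i)
the dimension keeps decreasing until we reach $\mathcal{L}_j$ for some $j < r$, and thereafter
the series is constant at a nonzero “value”, $\mathcal{L}_j = \mathcal{L}_{j+1} = \mathcal{L}_{j+2} = \ldots \neq 0$; or (ii)
the dimensions keep decreasing until, for some (least) $j \le r$, $\mathcal{L}_j = 0$, and then
of course $\mathcal{L}_{j+1} = \mathcal{L}_{j+2} = \ldots = 0$.

The Lie algebra $\mathcal{L}$ is solvable if for some (least) $j \le r$, $\mathcal{L}_j = 0$.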

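(vi) Another sequence of subalgebras which we can form is: we go from
$\mathcal{L}_0 = \mathcal{L}$ to $\mathcal{L}_1$ as before, but write $\mathcal{L}^1$ for it:

$$ \mathcal{L}^1 = \mathcal{L}_1 = [\mathcal{L}_0, \mathcal{L}_0] \tag{9.71} $$

Next, we form

$$ \mathcal{L}^2 = [\mathcal{L}_0, \mathcal{L}^1] = \text{all (real) linear combinations of } [x, y] \text{ with } x \in \mathcal{L}_0,\ y \in \mathcal{L}^1 \tag{9.72} $$

It is an instructive and not too complicated exercise to show that $\mathcal{L}^2$ is an
invariant subalgebra of $\mathcal{L}_0$ as well as of $\mathcal{L}^1$, and $\mathcal{L}^1/\mathcal{L}^2$ is abelian. At the next
step we define

$$ \mathcal{L}^3 = [\mathcal{L}_0, \mathcal{L}^2] = \text{all (real) linear combinations of } [x, y] \text{ with } x \in \mathcal{L}_0,\ y \in \mathcal{L}^2 \tag{9.73} $$

This in turn is seen to be an invariant subalgebra of $\mathcal{L}_0$, $\mathcal{L}^1$ and $\mathcal{L}^2$, with
$\mathcal{L}^2/\mathcal{L}^3$ being abelian. In this way we have the series:

$$ \mathcal{L}_0 \equiv \mathcal{L},\ \mathcal{L}^1 = \mathcal{L}_1 = [\mathcal{L}_0, \mathcal{L}_0],\ \mathcal{L}^2 = [\mathcal{L}_0, \mathcal{L}^1],\ \ldots,\ \mathcal{L}^{j+1} = [\mathcal{L}_0, \mathcal{L}^j],\ \ldots \tag{9.74} $$

$\mathcal{L}^{j+1}$ is an invariant subalgebra of $\mathcal{L}_0, \mathcal{L}^1, \mathcal{L}^2, \ldots, \mathcal{L}^j$ and $\mathcal{L}^j/\mathcal{L}^{j+1}$ is abelian.
The Lie algebra $\mathcal{L}$ is nilpotent if $\mathcal{L}^j = 0$ for some (least) $j \le r$.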
(vii) Connection between solvability and nilpotency: We have the two series
(9.70), (9.74) of Lie algebras above. While $\mathcal{L}_1 = \mathcal{L}^1$ by definition, one can easily
show (by induction for example) that

$$ \mathcal{L}_j \subseteq \mathcal{L}^j, \quad j = 2, 3, \ldots \tag{9.75} $$




This means that if $\mathcal{L}^j = 0$ for some $j$, then by that stage (or earlier) $\mathcal{L}_j = 0$
too. So,

$$ \mathcal{L} \text{ nilpotent} \Rightarrow \mathcal{L} \text{ solvable, but not conversely} \tag{9.76} $$


While the above concepts are important for the general theory of Lie alge- 
bras, for most physical applications the simple and semisimple types are more 
important. We next define such Lie algebras. 

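(viii) A Lie algebra $\mathcal{L}$ is simple (semisimple) if it has no proper invariant
(abelian invariant) subalgebra. We can immediately see the following
relationships:

$$ \begin{aligned}
& \mathcal{L} \text{ simple} \Rightarrow \mathcal{L} \text{ semisimple, but not conversely}; \\
& \mathcal{L} \text{ solvable} \Rightarrow \mathcal{L}_1 = \text{proper invariant subalgebra of } \mathcal{L} \Rightarrow \mathcal{L} \text{ not simple}; \\
& \mathcal{L} \text{ simple} \Rightarrow \mathcal{L}_1 = \mathcal{L} \Rightarrow \mathcal{L} \text{ not solvable}
\end{aligned} \tag{9.77} $$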


So, simplicity and solvability are mutually exclusive! To conclude this
section, let us define direct and semidirect sums of Lie algebras.

(ix) Direct sum of Lie algebras: Let $\mathcal{L}'$ be a proper invariant subalgebra
of a Lie algebra $\mathcal{L}$. If we can find another proper invariant subalgebra $\mathcal{L}''$ in $\mathcal{L}$
such that as vector spaces $\mathcal{L} = \mathcal{L}' \oplus \mathcal{L}''$ in the sense of a direct sum, then we say
$\mathcal{L}$ is the direct sum Lie algebra of $\mathcal{L}'$ and $\mathcal{L}''$. Since in this situation the only
element common to $\mathcal{L}'$ and $\mathcal{L}''$ is the zero element, we must have:

$$ [\mathcal{L}', \mathcal{L}'] \subseteq \mathcal{L}', \quad [\mathcal{L}', \mathcal{L}''] = 0, \quad [\mathcal{L}'', \mathcal{L}''] \subseteq \mathcal{L}'' \tag{9.78} $$


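(x) The semidirect sum of Lie algebras parallels the semidirect product
of groups. We say $\mathcal{L}$ is the semidirect sum of $\mathcal{L}'$ and $\mathcal{L}''$ if as vector spaces
$\mathcal{L} = \mathcal{L}' \oplus \mathcal{L}''$, and as Lie algebras

$$\begin{aligned}
[\mathcal{L}', \mathcal{L}'] &\subseteq \mathcal{L}',\\
[\mathcal{L}', \mathcal{L}''] &\subseteq \mathcal{L}'',\\
[\mathcal{L}'', \mathcal{L}''] &\subseteq \mathcal{L}''
\end{aligned} \tag{9.79}$$

This means that while both $\mathcal{L}'$ and $\mathcal{L}''$ are subalgebras of $\mathcal{L}$, the latter is an
invariant one; and one can then see that the factor algebra $\mathcal{L}/\mathcal{L}''$ is isomorphic
to $\mathcal{L}'$. From the second line in Eq.(9.79) we also see that each element of $\mathcal{L}'$
acts as an (in general) outer automorphism on $\mathcal{L}''$.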
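
As an illustration (an example chosen here, not one worked out in the text above), consider the three-dimensional Lie algebra of the Euclidean group of the plane, written in a real basis with one rotation generator $J$ and two translation generators $P_1, P_2$:

$$[J, P_1] = P_2, \qquad [J, P_2] = -P_1, \qquad [P_1, P_2] = 0.$$

Taking $\mathcal{L}' = \mathrm{span}\{J\}$ and $\mathcal{L}'' = \mathrm{span}\{P_1, P_2\}$, one finds $[\mathcal{L}', \mathcal{L}'] = 0 \subseteq \mathcal{L}'$, $[\mathcal{L}', \mathcal{L}''] \subseteq \mathcal{L}''$ and $[\mathcal{L}'', \mathcal{L}''] = 0 \subseteq \mathcal{L}''$, so the pattern of Eq.(9.79) holds and $\mathcal{L}$ is the semidirect sum of $\mathcal{L}'$ and $\mathcal{L}''$. It is not a direct sum, since $[\mathcal{L}', \mathcal{L}''] \neq 0$; and because $\mathcal{L}''$ is an abelian invariant subalgebra, this $\mathcal{L}$ is not semisimple.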


Exercises for Chapter 9 


1. For the group SO(3) of real proper orthogonal rotations in 3 dimensions, 
show that we have various possible parametrisations of group elements: 




(i) Euler angles: 


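$$R \in SO(3):\quad R = R(\psi, \theta, \phi) =
\begin{pmatrix} \cos\theta & -\sin\theta & 0\\ \sin\theta & \cos\theta & 0\\ 0 & 0 & 1 \end{pmatrix}
\times
\begin{pmatrix} \cos\psi & 0 & \sin\psi\\ 0 & 1 & 0\\ -\sin\psi & 0 & \cos\psi \end{pmatrix}
\begin{pmatrix} \cos\phi & -\sin\phi & 0\\ \sin\phi & \cos\phi & 0\\ 0 & 0 & 1 \end{pmatrix},$$

$$0 \le \theta, \phi \le 2\pi, \qquad 0 \le \psi \le \pi.$$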


(ii) Axis angle parameters: 
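$$R(\hat{n}, \alpha) = (R_{jk}(\hat{n}, \alpha)), \quad j, k = 1, 2, 3,$$
$$R_{jk}(\hat{n}, \alpha) = \delta_{jk} \cos\alpha + n_j n_k (1 - \cos\alpha) - \epsilon_{jkl}\, n_l \sin\alpha,$$
$$\hat{n} \in S^2, \qquad 0 \le \alpha \le \pi.$$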


(iii) Homogeneous Euler parameters: 
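$$R_{jk}(a_0, \mathbf{a}) = (a_0^2 - \mathbf{a}^2)\, \delta_{jk} + 2 a_j a_k - 2 a_0\, \epsilon_{jkl}\, a_l,$$
$$a_0 = \cos\frac{\alpha}{2}, \qquad a_j = n_j \sin\frac{\alpha}{2}, \qquad a_0^2 + \mathbf{a}^2 = 1.$$

In each case, find the parameter values for the identity element. Are they
always unique?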


2. For the group SO(3) described using axis angle parameters $R(\hat{n}, \alpha)$, show
that by keeping $\hat{n}$ fixed we obtain a one-parameter subgroup. What is the
range of $\alpha$?


3. Develop analogues of (i), (ii), (iii) of problem (1) above for the group 
SU(2). Find all possible one-parameter subgroups in this case. 


Chapter 10 


Linear Representations of 
Lie Algebras 


We have reviewed in Chapter 8 some aspects of the theory of group representa- 
tions. When applied to Lie groups in particular, they lead to linear representa- 
tions of Lie algebras. We devote this relatively short Chapter to this topic. 

Let a Lie group $G$ and its Lie algebra (sometimes written as $\underline{G}$ and sometimes,
in a general context, as $\mathcal{L}$) both be given. To get a linear representation
of $\mathcal{L}$, we must start with a (real or complex) linear vector space $\mathcal{V}$, of dimension
$n$ say, and set up the following association:

$$\begin{aligned}
\text{Elements in}\ \mathcal{L} &\to \text{linear transformations on}\ \mathcal{V},\\
\text{Lie bracket in}\ \mathcal{L} &\to \text{commutator of transformations on}\ \mathcal{V}
\end{aligned} \tag{10.1}$$
Why is the abstract Lie bracket in $\mathcal{L}$ “realised” as the commutator of linear
operators in a linear representation? Go back for a moment to the Lie group $G$:
actually the ensuing operations refer only to a suitable neighbourhood $\mathfrak{N} \subset G$
containing the identity. If we had a representation of $G$ on $\mathcal{V}$, then to the
element $a \in \mathfrak{N}$ with coordinates $\alpha$ we would have the representation matrix
$D(\alpha)$ say, obeying:

$$D(\alpha) D(\beta) = D(f(\alpha; \beta)), \qquad D(0) = 1 \tag{10.2}$$

What would the elements on an OPS look like in the representation? Assume
that the coordinate system is a canonical one, and for small $\alpha$, suppose

$$D(\alpha) \simeq 1 + \alpha^j X_j + O(\alpha^2) \tag{10.3}$$


Thus, given a basis for $\mathcal{L}$, $\{e_j\}$ say, the way the $\alpha$'s are enumerated is fixed; and
in the representation on $\mathcal{V}$ we find that to each $e_j$ there corresponds a linear
operator $X_j$ on $\mathcal{V}$:

$$e_j \in \mathcal{L} \to \text{linear operator}\ X_j\ \text{on}\ \mathcal{V} \tag{10.4}$$




These are called the generators of the representation. Next, for two elements
$a, b \in \mathfrak{N}$, both close to $e$, the commutator group element $q(a, b)$ has coordinates
given by Eq.(9.46). Therefore we have the condition,

$$\begin{aligned}
D(q(\alpha, \beta)) &= D(\alpha) D(\beta) D(-\alpha) D(-\beta)\\
&\simeq 1 + c^j_{kl}\, \alpha^k \beta^l X_j + \ldots
\end{aligned} \tag{10.5}$$


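If we put in expressions like (10.3) on the left hand side and retain terms at
most linear in $\alpha$ and $\beta$, we immediately find:

$$\begin{aligned}
\alpha^k X_k\, \beta^l X_l - \beta^l X_l\, \alpha^k X_k &= c^j_{kl}\, \alpha^k \beta^l X_j,\\
\text{i.e.,}\quad [X_k, X_l] \equiv X_k X_l - X_l X_k &= c^j_{kl} X_j
\end{aligned} \tag{10.6}$$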


And then the elements on an OPS are of course realised as “ordinary exponentials”:

$$D(\exp(u)) = 1 + \frac{u \cdot X}{1!} + \frac{(u \cdot X)^2}{2!} + \ldots = e^{u \cdot X}, \qquad u \cdot X = u^j X_j \tag{10.7}$$


One sees how the up-to-now formal exponential used in relating vectors in $\mathcal{L}$ to
elements in $G$ via OPS's becomes the ordinary exponential in a linear representation.
In other words, if we have a linear representation of $\mathcal{L}$ on $\mathcal{V}$, with the Lie
bracket of $\mathcal{L}$ interpreted as the ordinary commutator of generators acting on $\mathcal{V}$,
then (apart from global aspects) upon exponentiation we get a representation
of $G$: in fact since every $a \in \mathfrak{N}$ does lie on some OPS, the result (10.7) does fix
the way all elements in $\mathfrak{N} \subset G$ are to be represented.
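
The relation between the group commutator and the matrix commutator can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the real antisymmetric $3 \times 3$ generators $(L_j)_{kl} = -\epsilon_{jkl}$ of rotations (a choice made here, with $[L_1, L_2] = L_3$), a truncated power series for the exponential, and hypothetical helper names.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def expm(a, terms=25):
    """Matrix exponential by its power series (adequate for small matrices)."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in matmul(term, a)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result


# Real antisymmetric rotation generators, (L_j)_{kl} = -epsilon_{jkl}
L1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
L2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
L3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]


def group_commutator(s, t):
    """D(alpha) D(beta) D(-alpha) D(-beta) with alpha = s along e_1, beta = t along e_2."""
    factors = [expm(scale(s, L1)), expm(scale(t, L2)),
               expm(scale(-s, L1)), expm(scale(-t, L2))]
    out = factors[0]
    for f in factors[1:]:
        out = matmul(out, f)
    return out


def deviation(s, t):
    """Largest entry of D(q) - (1 + s t [L1, L2]), using [L1, L2] = L3."""
    d = group_commutator(s, t)
    return max(abs(d[i][j] - (float(i == j) + s * t * L3[i][j]))
               for i in range(3) for j in range(3))
```

For small `s` and `t` the deviation is of third order in the parameters, which is the content of Eqs.(10.5) and (10.6) for this pair of generators.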

(A more precise statement about global aspects is this. A hermitian representation
of the Lie algebra $\mathcal{L}$ of a Lie group $G$ yields, upon exponentiation, a
true representation of $\overline{G}$, the universal covering group of $G$, and in general not
of $G$ itself.)

This situation can be “turned around” in the following sense. Many groups
of practical interest are defined via a specific linear representation on some definite
space $\mathcal{V}$ (a defining matrix representation), which is declared to be globally
faithful. Then we can in principle “look” at the family of matrices involved, find
independent real parameters (at least locally) for them, evaluate the generator
matrices in that representation, and from their commutation properties “read
off” the structure constants needed to search for other representations! In fact
this is the most practically convenient way to handle the orthogonal, unitary
and symplectic groups, as we will see later.

At this point, a matter of convention and notation needs to be explained.
In quantum mechanical applications unitary representations of groups play an
important role. It is clear from Eq.(10.7) that for $D(a)$ to be unitary (assuming
$\mathcal{V}$ is equipped with the appropriate inner product), the generators $X_j$ must be
antihermitian. It is however usual to remove explicitly a factor of $i$ and deal




with hermitian generators. Hereafter we shall uniformly adhere to this quantum
mechanical convention. In the context of unitary (hermitian) representations of
$G$ ($\mathcal{L}$), we shall instead of the previous Eqs.(10.3),(10.6),(10.7) adopt the following
practice:

$$\begin{aligned}
&\text{Unitary representation of}\ G \leftrightarrow \text{Hermitian representation of}\ \mathcal{L} && (a)\\
&D(\alpha) \simeq 1 - i \alpha^j X_j + O(\alpha^2); && (b)\\
&e_j \to X_j = X_j^\dagger; && (c)\\
&[e_j, e_k] = c^l_{jk} e_l \to [X_j, X_k] = i c^l_{jk} X_l; && (d)\\
&a = \exp(u^j e_j) \to D(a) = \exp(-i u^j X_j) && (e)
\end{aligned} \tag{10.8}$$


Alternatively we could say that the abstract Lie bracket in $\mathcal{L}$ goes into $-i$
times the commutator.

A real orthogonal representation of $G$ (on a real $\mathcal{V}$ with appropriate inner
product) leads to generator matrices $X_j$ which are hermitian and antisymmetric,
hence purely imaginary. To emphasize that a common quantum mechanical
convention is being used, we express the situation thus:

$$\begin{aligned}
&\text{Unitary group representation} \leftrightarrow \text{hermitian generators},\quad X_j^\dagger = X_j;\\
&\text{Real orthogonal group representation} \leftrightarrow \text{hermitian antisymmetric generators},\quad X_j^\dagger = X_j,\ X_j^T = -X_j;\\
&\text{Real representation} \leftrightarrow \text{pure imaginary generators},\quad X_j^* = -X_j
\end{aligned} \tag{10.9}$$


In fact, even for representations that are neither unitary nor real orthogonal, the same
conventions (10.8) apply. In this most general case, the $X_j$ are neither hermitian
nor antisymmetric imaginary.

For Lie algebra representations, all the notions of invariant subspaces, irreducibility,
reducibility, decomposability, direct sum etc. can be directly carried
over from the discussions of Chapter 8. Other than these, the passages to the
contragredient, adjoint, and complex conjugate group representations are reflected
at the generator level thus:

$$\begin{aligned}
D \to (D^T)^{-1} &:\quad X_j \to -X_j^T;\\
D \to (D^\dagger)^{-1} &:\quad X_j \to X_j^\dagger;\\
D \to D^* &:\quad X_j \to -X_j^*
\end{aligned} \tag{10.10}$$
In these rules, the “quantum mechanical $i$” has been properly respected.
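
The rules (10.10) can be tried out on a familiar case. The following sketch assumes the defining representation of SU(2) with hermitian generators $X_j = \sigma_j/2$ (an illustrative choice made here), for which (10.8d) reads $[X_j, X_k] = i \epsilon_{jkl} X_l$, and checks that the transformed generators obey the same relations. All function names are hypothetical.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def epsilon(j, k, l):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (j - k) * (k - l) * (l - j) // 2


def transpose(m):
    return [list(col) for col in zip(*m)]


def conj(m):
    return [[complex(x).conjugate() for x in row] for row in m]


def neg(m):
    return [[-x for x in row] for row in m]


# Hermitian generators X_j = sigma_j / 2 (Pauli matrices divided by two)
X = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]


def obeys_algebra(gens, tol=1e-12):
    """Check [X_j, X_k] = i epsilon_{jkl} X_l for a triple of 2x2 matrices."""
    for j in range(3):
        for k in range(3):
            lhs = commutator(gens[j], gens[k])
            for a in range(2):
                for b in range(2):
                    rhs = sum(1j * epsilon(j, k, l) * gens[l][a][b] for l in range(3))
                    if abs(lhs[a][b] - rhs) > tol:
                        return False
    return True


contragredient = [neg(transpose(x)) for x in X]    # X_j -> -X_j^T
adjoint = [conj(transpose(x)) for x in X]          # X_j -> X_j^dagger
conjugate = [neg(conj(x)) for x in X]              # X_j -> -X_j^*
```

Dropping the minus sign in the first or third rule would break the commutation relations, which is one way to see why the signs in (10.10) are needed in the hermitian convention.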


We finally take up in this Chapter a particular real representation of a Lie
algebra $\mathcal{L}$ and “its Lie group $G$” (locally), which is intrinsic to the structure of




$\mathcal{L}$ and is called the adjoint representation. The beauty of this representation is
that $\mathcal{L}$ itself serves as the representation space $\mathcal{V}$, on which both $\mathcal{L}$ and $G$ are
made to act! For each $x \in \mathcal{L}$, define a linear transformation $\mathrm{ad}\, x$ which acts on
$\mathcal{L}$ in this way:

$$(\mathrm{ad}\, x) : y \to [x, y], \qquad \text{i.e.,}\quad (\mathrm{ad}\, x)\, y = [x, y] \tag{10.11}$$


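Here the abstract Lie bracket in $\mathcal{L}$ has been used, and the linearity properties
of this bracket ensure that $\mathrm{ad}\, x$ is a linear transformation. Do we have
here a representation of $\mathcal{L}$? Yes, by Jacobi!

$$\begin{aligned}
(\mathrm{ad}\, x)(\mathrm{ad}\, y) z - (\mathrm{ad}\, y)(\mathrm{ad}\, x) z &= [x, [y, z]] - [y, [x, z]]\\
&= [[x, y], z]\\
&= (\mathrm{ad}\, [x, y])\, z,\\
\text{i.e.,}\quad [\mathrm{ad}\, x, \mathrm{ad}\, y] &= \mathrm{ad}\, [x, y]
\end{aligned} \tag{10.12}$$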


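(The brackets on the two sides of this last line are formally different!) In a basis
$\{e_j\}$ for $\mathcal{L}$, the generator matrices are the structure constants (times a factor of $i$):

$$\begin{aligned}
(\mathrm{ad}\, e_j)\, e_k &= [e_j, e_k] = c^l_{jk}\, e_l \;\Rightarrow\\
\big(X^{(\mathrm{adj})}_j\big)^l{}_k &= i \times \text{coefficient of}\ e_l\ \text{in}\ (\mathrm{ad}\, e_j)\, e_k\\
&= i\, c^l_{jk}
\end{aligned} \tag{10.13}$$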


Then the Jacobi identity (9.53) for the structure constants amounts to the generator
commutation relations:

$$\big[X^{(\mathrm{adj})}_j, X^{(\mathrm{adj})}_k\big] = i\, c^l_{jk}\, X^{(\mathrm{adj})}_l \tag{10.14}$$


In the expressions in Eq.(10.13), while one of the subscripts on the structure 
constant is used up to enumerate the X’s, the remaining superscript (subscript) 
acts as a matrix row (column) index. 
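
The construction (10.13) and the check (10.14) are easy to automate. The sketch below stores structure constants as `c[j][k][l]` for $c^l_{jk}$ (a storage convention chosen here) and tries two sets of constants: $c^l_{jk} = \epsilon_{jkl}$, and the constants read off from the relations $[J_0, K_1] = iK_2$, $[J_0, K_2] = -iK_1$, $[K_1, K_2] = -iJ_0$ using (10.8d). Function names are illustrative.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def epsilon(j, k, l):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (j - k) * (k - l) * (l - j) // 2


def adjoint_generators(c):
    """(X_j)[l][k] = i c^l_{jk}: row index l, column index k, as in Eq.(10.13)."""
    r = len(c)
    return [[[1j * c[j][k][l] for k in range(r)] for l in range(r)]
            for j in range(r)]


def obeys_10_14(c, tol=1e-12):
    """Check [X_j, X_k] = i c^l_{jk} X_l for the adjoint generators."""
    x, r = adjoint_generators(c), len(c)
    for j in range(r):
        for k in range(r):
            lhs = commutator(x[j], x[k])
            for a in range(r):
                for b in range(r):
                    rhs = sum(1j * c[j][k][l] * x[l][a][b] for l in range(r))
                    if abs(lhs[a][b] - rhs) > tol:
                        return False
    return True


def is_hermitian(m, tol=1e-12):
    n = len(m)
    return all(abs(m[a][b] - m[b][a].conjugate()) < tol
               for a in range(n) for b in range(n))


# c^l_{jk} = epsilon_{jkl}
so3 = [[[epsilon(j, k, l) for l in range(3)] for k in range(3)] for j in range(3)]

# Basis order (J0, K1, K2): c^2_{01} = 1, c^1_{02} = -1, c^0_{12} = -1, antisymmetrised
so21 = [[[0] * 3 for _ in range(3)] for _ in range(3)]
for (j, k, l, v) in [(0, 1, 2, 1), (0, 2, 1, -1), (1, 2, 0, -1)]:
    so21[j][k][l] = v
    so21[k][j][l] = -v
```

In the first case all three adjoint generators come out hermitian; in the second, the generators associated with $K_1$ and $K_2$ do not, while (10.14) still holds, in line with the remark that the conventions (10.8) apply to non-unitary representations as well.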

Exponentiation of the adjoint representation of $\mathcal{L}$ leads of course to a real
local representation of $G$ on $\mathcal{L}$ itself. This, in its global aspect, can be called
the adjoint representation of the simply connected universal covering group $\overline{G}$,
which $\mathcal{L}$ determines unambiguously. Alternatively, one also often defines as the
adjoint group of $\mathcal{L}$ that Lie group which possesses $\mathcal{L}$ as its Lie algebra, and for
which by definition the set of all exponentials of the generators, $\exp\big(i u^j X^{(\mathrm{adj})}_j\big)$, and
products thereof, give a faithful representation.


The adjoint representation will play an important role in all our consider- 
ations. 




Exercises for Chapter 10 


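1. Show that among functions $f(q, p), g(q, p), \ldots$ on a classical mechanical
real $2n$-dimensional phase space, the Poisson bracket

$$\{f, g\} = \sum_{j=1}^{n} \left( \frac{\partial f}{\partial q_j} \frac{\partial g}{\partial p_j} - \frac{\partial f}{\partial p_j} \frac{\partial g}{\partial q_j} \right)$$

is a realisation of the Lie bracket concept. Similarly show that among
three-dimensional real vectors $\mathbf{a}, \mathbf{b}, \ldots$, the vector product operation

$$\mathbf{a}, \mathbf{b} \to \mathbf{a} \wedge \mathbf{b}$$

is also a realisation of this concept.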


2. From the quantum theory of angular momentum we know that the Lie
algebra commutation relations for generators of SO(3) are

$$[J_j, J_k] = i\, \epsilon_{jkl}\, J_l,$$

$\epsilon_{jkl}$ = completely antisymmetric, $\epsilon_{123} = 1$. Using Eq.(10.13), verify that
the adjoint representation of SO(3) is its real 3 dimensional defining representation.


Chapter 11 


Complexification and 
Classification of Lie 
Algebras 


So far we have dealt with real (finite dimensional) Lie algebras, as they arise 
while analysing Lie groups whose elements can be parametrised with an essen- 
tial and finite number of real independent coordinates. The task of trying to 
systematically classify all such Lie algebras is a very difficult and complex one, 
and it would definitely be out of place to attempt to describe here in all detail 
the results known in this area. On the other hand, it seems worthwhile at least 
introducing some of the key concepts that are used in this branch of mathemat- 
ics, and conveying the flavour of the subject. We attempt to do no more than 
this in the present Chapter. To go further, the reader is advised to study one 
of the works in the list of references given at the end. 

We describe the process of complexification of a real Lie algebra, and give 
the definition of a complex Lie algebra. We also indicate how the properties 
of solvability, semisimplicity and simplicity are used in the classification pro- 
gramme. Our final aim is to arrive at and deal with the series of Compact 
Simple Lie Algebras (CSLA), and their associated Lie groups. 


11.1 Complexification of a Real Lie Algebra 


Let $\mathcal{L}$ be a real $r$-dimensional Lie algebra. By a straightforward procedure we
can complexify it and arrive at a complex Lie algebra $\tilde{\mathcal{L}}$. To begin, we complexify
$\mathcal{L}$ as a vector space. Thus the complex $r$-dimensional vector space $\tilde{\mathcal{L}}$ consists of
formal expressions of the form,

$$z = x + iy \in \tilde{\mathcal{L}}, \qquad x, y \in \mathcal{L} \tag{11.1}$$




Two such expressions are the same only if the separate parts are equal:

$$x + iy = x' + iy' \Leftrightarrow x = x',\ y = y' \tag{11.2}$$

For a complex number $a + ib$, with $a$ and $b$ real, we define

$$(a + ib)(x + iy) = ax - by + i(ay + bx) \tag{11.3}$$

which is again of the form (11.1). You can check that

$$(a + ib)(x + iy) = 0 \Leftrightarrow \text{either}\ a + ib = 0\ \text{or}\ x + iy = 0\ \text{or both} \tag{11.4}$$

Finally, then, Lie brackets in $\tilde{\mathcal{L}}$ are defined by

$$[x + iy, u + iv] = [x, u] - [y, v] + i([x, v] + [y, u]) \tag{11.5}$$


The validity of the Jacobi identity is obvious. 

We say that $\tilde{\mathcal{L}}$ is the unique complex form of the real Lie algebra $\mathcal{L}$. The
complex dimension of $\tilde{\mathcal{L}}$ is the same as the real dimension of $\mathcal{L}$; and a basis $\{e_j\}$
for $\mathcal{L}$ remains a basis for $\tilde{\mathcal{L}}$ as well. In this basis, the structure constants of $\tilde{\mathcal{L}}$
are the same as those of $\mathcal{L}$, namely $c^l_{jk}$, and so they are real.


Having seen how $\tilde{\mathcal{L}}$ arises from $\mathcal{L}$, we can now directly define what we mean
by a complex Lie algebra. A complex Lie algebra $\tilde{\mathcal{L}}$ of dimension $r$ is a complex
$r$-dimensional vector space, with elements $z, z', \ldots \in \tilde{\mathcal{L}}$, among which a Lie
bracket is defined:

$$\begin{aligned}
& z, z' \in \tilde{\mathcal{L}},\ \lambda, \mu \in \mathbb{C};\\
& \lambda z + \mu z' \in \tilde{\mathcal{L}};\\
& [z, z'] \in \tilde{\mathcal{L}};\\
& [z, z'] = \text{linear, antisymmetric};\\
& [[z, z'], z''] + [[z', z''], z] + [[z'', z], z'] = 0
\end{aligned} \tag{11.6}$$

This idea of a complex Lie algebra is directly defined in this way, and not derived
from a finite dimensional Lie group with real parameters. If $\{\tilde{e}_j\}$ is a basis for
$\tilde{\mathcal{L}}$, in general we have complex structure constants $\tilde{c}^l_{jk}$ obeying antisymmetry
and the Jacobi condition.

For a general complex Lie algebra, there may be no basis in which all the
structure constants become real! Of course an $r$-dimensional complex Lie algebra
$\tilde{\mathcal{L}}$ can be viewed as a real $2r$-dimensional one with an additional operation
representing “multiplication by $i$”.

Now let us mention some statements which are intuitively evident, partly
repeating what was said earlier:

(i) A given real $r$-dimensional Lie algebra $\mathcal{L}$ leads to a unique complex $r$-dimensional
Lie algebra $\tilde{\mathcal{L}}$, its complex form. A basis for $\mathcal{L}$ can be used as a
basis for $\tilde{\mathcal{L}}$, and in that case the structure constants are real and unchanged.




(ii) Many distinct real Lie algebras $\mathcal{L}, \mathcal{L}', \ldots$ may lead to the same complex
one $\tilde{\mathcal{L}}$; in that case each of the former is called a real form of the latter, and $\tilde{\mathcal{L}}$
arises by complexification of any of $\mathcal{L}, \mathcal{L}', \ldots$.

(iii) A given complex $r$-dimensional Lie algebra $\tilde{\mathcal{L}}$ may have none, or many,
real forms. If it has none, then there is no basis for $\tilde{\mathcal{L}}$ with respect to which all
the structure constants have real values!

While the use of complex Lie algebras is useful and indeed unavoidable for 
purposes of the general theory, the process of “returning to the real forms” is 
made quite difficult because of the above facts. As we will see later, though, 
some fundamental theorems of Weyl help simplify the situation for the real 
compact simple Lie algebras. 

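The definitions of subalgebras, invariant subalgebras, factor algebras etc.
can all be given quite easily for complex Lie algebras. The relation between
such properties for a real $\mathcal{L}$ and for its complex form $\tilde{\mathcal{L}}$ is this:

$$\begin{aligned}
\tilde{\mathcal{L}}\ \text{abelian} &\Leftrightarrow \mathcal{L}\ \text{abelian};\\
\tilde{\mathcal{L}}\ \text{solvable} &\Leftrightarrow \mathcal{L}\ \text{solvable};\\
\tilde{\mathcal{L}}\ \text{simple} &\not\Leftrightarrow \mathcal{L}\ \text{simple}
\end{aligned} \tag{11.7}$$

We have a surprise in the last line! The “fly in the ointment” is the case
of the groups $SL(n, \mathbb{C})$; while the real Lie algebra of $SL(n, \mathbb{C})$ is simple, after
complexification it is no longer simple.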


11.2 Solvability, Levi’s Theorem, and Cartan’s 
Analysis of Complex (Semi) Simple Lie 
Algebras 


It turns out that solvability is a crucial generalisation of abelianness, and it has
a hereditary and contagious character. For both real and complex cases one
finds solvability of a Lie algebra implies the same for all its subalgebras and for
the factor algebras with respect to its invariant subalgebras. Even more is true:
if a (real or complex) Lie algebra $\mathcal{L}$ has an invariant subalgebra $\mathcal{L}'$, and both
$\mathcal{L}'$ and the factor $\mathcal{L}/\mathcal{L}'$ are solvable, then so is $\mathcal{L}$ itself!


Levi Splitting Theorem 


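Any (real or complex) Lie algebra $\mathcal{L}$ can be split into the semidirect sum of a
maximal solvable invariant subalgebra $S$ and a semisimple subalgebra $T$:

$$\begin{aligned}
\mathcal{L} &= S \mathbin{+\!\!\!\!\supset} T,\\
[S, S],\ [T, S] &\subseteq S,\\
[T, T] &\subseteq T
\end{aligned} \tag{11.8}$$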




Based on this result, can we classify all real or complex Lie algebras? No!
While the semisimple ones can be classified, this has so far not been possible
for the solvable ones. However, having seen this basic theorem, and appreciated
where the situation regarding classification now stands, we shall hereafter
consider mainly the semisimple case.

The classification of complex semisimple Lie algebras rests heavily on some 
fundamental theorems of Cartan, and work and analysis by Killing, Cartan and 
others. We give a qualitative account of these matters, staying at the complex 
level up to a certain point, then switching to the real compact simple cases. We 
begin with 


Theorem (Cartan, 1894) 


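A (real or complex) semisimple Lie algebra $\mathcal{L}$ is the direct sum of mutually
commuting simple nonabelian subalgebras:

$$\begin{aligned}
&\mathcal{L}\ \text{semisimple} \Rightarrow \mathcal{L} = \mathcal{L}' \oplus \mathcal{L}'' \oplus \ldots, \qquad \mathcal{L}', \mathcal{L}'', \ldots\ \text{simple, nonabelian},\\
&[\mathcal{L}', \mathcal{L}''] = [\mathcal{L}', \mathcal{L}'''] = [\mathcal{L}'', \mathcal{L}'''] = \ldots = 0
\end{aligned} \tag{11.9}$$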


Next, as a result of the work of Killing, Cartan and others, the complex
simple Lie algebras have all been found and classified. There are four infinite 
families and five exceptional cases. We will deal later with the compact real 
forms of these. For the present, let us see how the structure of complex simple 
Lie algebras has been analysed. The method rests largely on skillful exploitation 
of the Jacobi identities for structure constants. Incidentally we can always keep 
in mind that analysis of the adjoint representation amounts to analysis of the 
basic commutation relations and so of structure constants. 

To be reasonably well organized, let us number the statements to follow, 
and arrange them in as good a sequence as we can! 

1. Let the (complex $r$-dimensional) Lie algebra $\mathcal{L}$ in some basis have structure
constants $c^m_{jk}$ which could be complex (we save the index $l$ for a good
reason); and define a “metric tensor”

$$g_{jk} = g_{kj} = c^m_{jn}\, c^n_{km}, \tag{11.10}$$

the sums on $m$ and $n$ being understood. For simplicity we avoid tildes on $c$ and
$g$, though they may be complex. A result of Cartan says:

$$\mathcal{L}\ \text{is semisimple} \Leftrightarrow |g_{jk}| \equiv \det(g_{jk}) \neq 0 \tag{11.11}$$

In one direction, the proof is very easy. If $\mathcal{L}$ has an abelian invariant
subalgebra, we can choose a basis in $\mathcal{L}$ such that certain rows and columns of
$(g_{jk})$ vanish identically, so

$$\mathcal{L}\ \text{not semisimple} \Rightarrow |g| = 0, \qquad |g| \neq 0 \Rightarrow \mathcal{L}\ \text{semisimple} \tag{11.12}$$




The harder thing to show is that

$$\mathcal{L}\ \text{semisimple} \Rightarrow |g| \neq 0, \qquad |g| = 0 \Rightarrow \mathcal{L}\ \text{not semisimple} \tag{11.13}$$

We leave this as an exercise to the interested reader. In particular, Eq.(11.11)
gives:

$$\mathcal{L}\ \text{simple} \Rightarrow |g_{jk}| \neq 0. \tag{11.14}$$

Hereafter, we assume $\mathcal{L}$ is simple, so we can and will use the fact that $(g_{jk})$
is nonsingular.
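
The criterion (11.11) can be tried on small examples. The sketch below computes $g_{jk} = c^m_{jn} c^n_{km}$ and its determinant for two sets of structure constants chosen here for illustration: $c^l_{jk} = \epsilon_{jkl}$, and the two-dimensional algebra $[e_1, e_2] = e_2$, which has the abelian invariant subalgebra spanned by $e_2$. The storage convention `c[j][k][l]` for $c^l_{jk}$ and the function names are assumptions of this sketch.

```python
def epsilon(j, k, l):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (j - k) * (k - l) * (l - j) // 2


def cartan_metric(c):
    """g_{jk} = c^m_{jn} c^n_{km}, with c[j][k][l] standing for c^l_{jk}."""
    r = len(c)
    return [[sum(c[j][n][m] * c[k][m][n] for m in range(r) for n in range(r))
             for k in range(r)] for j in range(r)]


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** col * m[0][col]
               * det([row[:col] + row[col + 1:] for row in m[1:]])
               for col in range(len(m)))


# c^l_{jk} = epsilon_{jkl}
so3 = [[[epsilon(j, k, l) for l in range(3)] for k in range(3)] for j in range(3)]

# [e1, e2] = e2 (0-based in code): c^1_{01} = 1, c^1_{10} = -1
affine = [[[0, 0], [0, 1]], [[0, -1], [0, 0]]]

g_so3 = cartan_metric(so3)
g_affine = cartan_metric(affine)
```

In the first case the determinant is nonzero; in the second, the row and column belonging to $e_2$ vanish identically, as described in the easy direction of the proof.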


2. From the structure constants $c^m_{jk}$, by lowering the superscript using the
metric, we can construct a three-subscript symbol which is totally antisymmetric:

$$c_{jkm} = g_{mn}\, c^n_{jk} \tag{11.15}$$

Since $(g_{jk})$ is nonsingular, we can always get back the true structure constants
from here. The antisymmetry in the first two indices is evident. The Jacobi
identity leads to antisymmetry in the second and third:

$$\begin{aligned}
c_{jkm} &= c^n_{jk}\, c^p_{nq}\, c^q_{mp}\\
&= -\big(c^n_{kq}\, c^p_{nj} + c^n_{qj}\, c^p_{nk}\big)\, c^q_{mp} && (a)\\
&= c^n_{kq}\, c^p_{jn}\, c^q_{mp} - c^n_{qj}\, c^p_{nk}\, c^q_{mp}\\
&= c^n_{kq}\, c^q_{mp}\, c^p_{jn} - c^n_{mq}\, c^q_{kp}\, c^p_{jn} && (b)\\
&= \big(c^n_{kq}\, c^q_{mp} - c^n_{mq}\, c^q_{kp}\big)\, c^p_{jn}
\end{aligned} \tag{11.16}$$

Here, at the second step (a), we have used the Jacobi identity for the first two $c$'s
at the previous step; and at the fourth step (b) in the second term the dummy
indices were changed according to $p \to q \to n \to p$. Thus we have established

$$c_{jkm} = -c_{jmk} \tag{11.17}$$

and the total antisymmetry of $c_{jkm}$ follows. We will use this later on at point
no. 7.
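
The total antisymmetry of $c_{jkm}$ is most instructive when the metric is not a multiple of the unit matrix. The sketch below uses, as an assumed example, the real structure constants $c^2_{01} = 1$, $c^1_{02} = -1$, $c^0_{12} = -1$ (antisymmetrised), which satisfy the Jacobi identity; indices are 0-based and `c[j][k][l]` stands for $c^l_{jk}$. Function names are illustrative.

```python
def cartan_metric(c):
    """g_{jk} = c^m_{jn} c^n_{km}, with c[j][k][l] standing for c^l_{jk}."""
    r = len(c)
    return [[sum(c[j][n][m] * c[k][m][n] for m in range(r) for n in range(r))
             for k in range(r)] for j in range(r)]


def lowered(c):
    """c_{jkm} = g_{mn} c^n_{jk}, returned as low[j][k][m]."""
    g, r = cartan_metric(c), len(c)
    return [[[sum(g[m][n] * c[j][k][n] for n in range(r)) for m in range(r)]
             for k in range(r)] for j in range(r)]


def is_totally_antisymmetric(low):
    """Antisymmetry under exchange of the first two and of the last two indices."""
    r = len(low)
    return all(low[j][k][m] == -low[k][j][m] and low[j][k][m] == -low[j][m][k]
               for j in range(r) for k in range(r) for m in range(r))


# Assumed example: [e0, e1] = e2, [e0, e2] = -e1, [e1, e2] = -e0
c = [[[0] * 3 for _ in range(3)] for _ in range(3)]
for (j, k, l, v) in [(0, 1, 2, 1), (0, 2, 1, -1), (1, 2, 0, -1)]:
    c[j][k][l] = v
    c[k][j][l] = -v

g = cartan_metric(c)
c_low = lowered(c)
```

Here the structure constants with one upper index are not totally antisymmetric by themselves (for instance $c^2_{01} = 1$ while $c^0_{12} = -1$), but after lowering with the indefinite metric the three-index symbol is.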
3. Now choose an element $A \in \mathcal{L}$ and set up the “eigenvalue problem”

$$(\mathrm{ad}\, A) X = [A, X] = \rho X, \qquad X \in \mathcal{L},\ \rho \in \mathbb{C}. \tag{11.18}$$

One solution is of course $\rho = 0$, $X = A$. In any case, the number of distinct
eigenvalues or roots can be no more than $r$, the dimension of $\mathcal{L}$. Now vary $A$ and
choose it so as to maximise the number of distinct roots $\rho$. Then another
Theorem of Cartan says that for such an $A$,

(a) There is a set $\mathfrak{R}_0$ of distinct non-zero roots $\alpha, \beta, \ldots$

(b) Each of these is nondegenerate, i.e., for each $\alpha \in \mathfrak{R}_0$, there is a unique
$E_\alpha \in \mathcal{L}$ (unique up to scalar multiplication) obeying

$$\alpha \in \mathfrak{R}_0: \quad [A, E_\alpha] = \alpha E_\alpha. \tag{11.19}$$




(c) Only the root $\rho = 0$ is degenerate, and the degree of degeneracy $l$ of
this root is characteristic of $\mathcal{L}$ and is called the rank of $\mathcal{L}$.

(d) The corresponding $l$ “eigenvectors” are elements $H_a \in \mathcal{L}$;
they are linearly independent and obey

$$[H_a, H_b] = 0, \qquad a, b = 1, 2, \ldots, l \tag{11.20}$$

(e) The $H_a$ for $a = 1, 2, \ldots, l$ and $E_\alpha$ for $\alpha \in \mathfrak{R}_0$ are independent and give
a basis for $\mathcal{L}$.

It must be clear that a “maximal” $A$ for which all this happens is by no
means unique! At this point, we see that $\mathfrak{R}_0$ consists of $r - l$ distinct nonzero
(complex) numbers. We will soon refine and get a better description of $\mathfrak{R}_0$.
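
A small numerical sketch may help make the eigenvalue problem (11.18) concrete. It assumes the three-dimensional algebra with real structure constants $[e_j, e_k] = \epsilon_{jkl} e_l$, extended to complex coefficients by bilinearity as in Eq.(11.5), and the choice $A = e_3$; the vectors `E_plus`, `E_minus` and the helper names are illustrative choices made here.

```python
def epsilon(j, k, l):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (j - k) * (k - l) * (l - j) // 2


def bracket(x, y):
    """[x, y] for (possibly complex) coordinate vectors, with [e_j, e_k] = eps_{jkl} e_l."""
    return [sum(x[j] * y[k] * epsilon(j, k, l) for j in range(3) for k in range(3))
            for l in range(3)]


def eigenvalue(a, x, tol=1e-12):
    """Return rho if [a, x] = rho x holds, otherwise None."""
    ax = bracket(a, x)
    idx = next(i for i, t in enumerate(x) if t != 0)
    rho = ax[idx] / x[idx]
    ok = all(abs(ax[i] - rho * x[i]) < tol for i in range(3))
    return rho if ok else None


A = [0, 0, 1]            # A = e_3 (0-based position 2)
E_plus = [1, 1j, 0]      # e_1 + i e_2
E_minus = [1, -1j, 0]    # e_1 - i e_2

roots = [eigenvalue(A, E_plus), eigenvalue(A, E_minus), eigenvalue(A, A)]
```

With these conventions the non-zero roots come out as the pair $\mp i$, each nondegenerate, and the zero root has degeneracy one, so $r = 3$ and $l = 1$. The eigenvectors have complex coefficients, which is why the analysis is carried out in the complex algebra; in the hermitian convention of Eq.(10.8) the factor $i$ is absorbed and the corresponding roots are real. The bracket of the two eigenvectors is proportional to $A$ itself.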

4. Some immediate consequences, on exploiting the Jacobi identities, follow.
Since $A$ is a solution to (11.18) for $\rho = 0$, it is a linear combination of the
$H_a$:

$$A = \lambda^a H_a, \qquad \lambda^a \in \mathbb{C},\ \text{not all identically zero} \tag{11.21}$$

Then, using the three-term Jacobi identity for $A$, $H_a$ and $E_\alpha$, we get

$$[A, [H_a, E_\alpha]] = \alpha\, [H_a, E_\alpha], \qquad \text{i.e.,}\quad [H_a, E_\alpha] = \alpha_a E_\alpha,\ \alpha_a \in \mathbb{C} \tag{11.22}$$

Thus, for each (complex) number $\alpha \in \mathfrak{R}_0$, there is a (complex) $l$-component
“root vector” $\{\alpha_a\}$, and since from Eqs.(11.19),(11.21),(11.22), we get

$$\alpha = \lambda^a \alpha_a, \tag{11.23}$$

each vector $\{\alpha_a\}$ cannot vanish identically. In fact we can say:

$$\begin{aligned}
&\alpha \in \mathfrak{R}_0 \Rightarrow \{\alpha_a\} \neq 0;\\
&\alpha, \beta \in \mathfrak{R}_0,\ \alpha \neq \beta \Rightarrow \{\alpha_a\} \neq \{\beta_a\}.
\end{aligned} \tag{11.24}$$

The set $\mathfrak{R}_0$ consists of $r - l$ distinct non-zero (complex) numbers. Let us
now define $\mathfrak{R}$ to be the set of the corresponding $r - l$ distinct non-vanishing
(complex) root vectors $\underline{\alpha}, \underline{\beta}, \ldots$ with components $\alpha_a, \beta_a, \ldots$. Elements of $\mathfrak{R}_0$
and of $\mathfrak{R}$ are connected by Eq.(11.23). We are of course free to replace the
$H_a$ by nonsingular linear combinations of themselves, leaving the $E_\alpha$ and the
original $A$ unchanged. If we do this we will regard the vectors $\underline{\alpha}, \underline{\beta}, \ldots \in \mathfrak{R}$ as
not having changed, but as being resolved in a new basis.

5. The information so far collected on the structure constants $c^m_{jk}$ is this:
each index $j, k, m, \ldots = 1, 2, \ldots, r$ goes partly over $a, b, \ldots = 1, 2, \ldots, l$, and
partly over the $r - l$ distinct non-zero numbers $\alpha \in \mathfrak{R}_0$ (or if we wish, $\underline{\alpha} \in \mathfrak{R}$).
This is the result of using $H_a$ and $E_\alpha$ as a basis for $\mathcal{L}$. Then,

$$\text{(a) Property (3d) above} \Rightarrow c^m_{ab} = 0 \tag{11.25}$$




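$$\text{(b) Property (4) above} \Rightarrow c^m_{a\alpha} = \alpha_a\, \delta_{m,\alpha} \tag{11.26}$$

(c) If $\alpha, \beta, \alpha + \beta \in \mathfrak{R}_0$, use of the Jacobi identity on $A, E_\alpha, E_\beta$ gives

$$[E_\alpha, E_\beta] = N_{\alpha,\beta}\, E_{\alpha+\beta}, \qquad N_{\alpha,\beta} = -N_{\beta,\alpha}, \qquad c^m_{\alpha\beta} = N_{\alpha,\beta}\, \delta_{m,\alpha+\beta} \tag{11.27}$$

(d) If $\alpha, \beta \in \mathfrak{R}_0$, $\alpha + \beta \neq 0$, $\alpha + \beta \notin \mathfrak{R}_0$, then necessarily

$$[E_\alpha, E_\beta] = 0 \tag{11.28}$$

It may be mentioned that (c) and (d) above are necessary consequences of
the Jacobi identities, in advance of knowing the contents of the set $\mathfrak{R}_0$.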

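6. Now we will show that $\alpha \in \mathfrak{R}_0$ implies $-\alpha \in \mathfrak{R}_0$ as well. In turn, and
this is nontrivial, it will mean that $\underline{\alpha} \in \mathfrak{R}$ implies $-\underline{\alpha} \in \mathfrak{R}$ too. Consider the
subset of $(r - l)$ rows of the matrix $(g_{jk})$ in which $j = \alpha \in \mathfrak{R}_0$. Indicating all
summations explicitly, we have, using results gathered above:

$$\begin{aligned}
g_{\alpha k} &= \sum_{j,m} c^j_{\alpha m}\, c^m_{kj}\\
&= \sum_j \Big[ -\sum_{a=1}^{l} \alpha_a\, \delta_{j,\alpha}\, c^a_{kj} + \sum_{\substack{\beta \in \mathfrak{R}_0\\ (\alpha+\beta \in \mathfrak{R}_0)}} N_{\alpha,\beta}\, \delta_{j,\alpha+\beta}\, c^\beta_{kj} + \big(c^j_{\alpha,-\alpha}\, c^{-\alpha}_{kj}\big) \Big]\\
&= -\sum_{a=1}^{l} \alpha_a\, c^a_{k\alpha} + \sum_{\substack{\beta \in \mathfrak{R}_0\\ (\alpha+\beta \in \mathfrak{R}_0)}} N_{\alpha,\beta}\, c^\beta_{k,\alpha+\beta} + \sum_j \big(c^j_{\alpha,-\alpha}\, c^{-\alpha}_{kj}\big)
\end{aligned} \tag{11.29}$$

In the preceding two lines we have put inside parentheses a term which is
present only if $-\alpha \in \mathfrak{R}_0$, not otherwise. If indeed $-\alpha \in \mathfrak{R}_0$, then the Jacobi
identity for $A$, $E_\alpha$ and $E_{-\alpha}$ gives

$$\begin{aligned}
[E_\alpha, E_{-\alpha}] &= \alpha^a H_a, \qquad \alpha^a \in \mathbb{C};\\
c^j_{\alpha,-\alpha} &= \alpha^a \quad \text{if}\ j = a,\\
&= 0 \quad \text{if}\ j = \beta \in \mathfrak{R}_0
\end{aligned} \tag{11.30}$$

Now going back to Eq.(11.29), we can take either $k = b = 1, 2, \ldots, l$ or
$k = \gamma \in \mathfrak{R}_0$. In each case we can list contributions from each of the three terms,
assuming in the latter case that $\gamma \neq -\alpha$:

$$\begin{aligned}
k = b &: \quad g_{\alpha b} = 0 + 0 + (0),\\
k = \gamma \neq -\alpha &: \quad g_{\alpha\gamma} = 0 + 0 + (0).
\end{aligned} \tag{11.31}$$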




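Here, we have used Eqs.(11.25),(11.26),(11.30) when necessary. We can conclude
that

$$\alpha \in \mathfrak{R}_0,\ -\alpha \notin \mathfrak{R}_0 \Rightarrow g_{\alpha k} = 0\ \text{for all}\ k \Rightarrow |g_{jk}| = 0, \tag{11.32}$$

which contradicts Cartan's theorem (11.14)! Therefore $\mathfrak{R}_0$ necessarily consists
of equal and opposite non-zero pairs of numbers $\pm\alpha, \pm\beta, \ldots$; and in the $\alpha$th row
of $(g_{jk})$ the only non-zero element occurs for $k = -\alpha$:

$$g_{\alpha,-\alpha} = \sum_{a=1}^{l} \alpha_a \alpha^a + \sum_{\substack{\beta \in \mathfrak{R}_0\\ (\alpha+\beta \in \mathfrak{R}_0)}} N_{\alpha,\beta}\, N_{-\alpha,\alpha+\beta} + \sum_{a=1}^{l} \alpha^a\, c^{-\alpha}_{-\alpha,a} \tag{11.33}$$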

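The evaluation of the last term here involves some subtlety. It is necessary
to prove that the $l$-component vector $(-\alpha)_a$ in $\mathfrak{R}$, associated with $-\alpha \in \mathfrak{R}_0$, is
indeed $-\alpha_a$. To see this we start with Eq.(11.30) and apply the Jacobi identity
for $H_a$, $E_\alpha$ and $E_{-\alpha}$:

$$\begin{aligned}
0 &= [H_a, [E_\alpha, E_{-\alpha}]]\\
&= [E_\alpha, [H_a, E_{-\alpha}]] - [E_{-\alpha}, [H_a, E_\alpha]]\\
&= (-\alpha)_a\, [E_\alpha, E_{-\alpha}] + \alpha_a\, [E_\alpha, E_{-\alpha}]\\
&= \big(\alpha_a + (-\alpha)_a\big)\, \alpha^b H_b
\end{aligned} \tag{11.34}$$

We will show (in point seven to follow) that $\{\alpha^a\}$ cannot vanish identically, so
we can conclude:

$$c^{-\alpha}_{a,-\alpha} = (-\alpha)_a = -\alpha_a, \qquad \text{no sum on}\ \alpha \tag{11.35}$$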


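Using this in Eq.(11.33), we see that

$$g_{\alpha,-\alpha} = 2 \sum_{a} \alpha_a \alpha^a - \sum_{\substack{\beta \in \mathfrak{R}_0\\ (\alpha+\beta \in \mathfrak{R}_0)}} N_{\alpha,\beta}\, N_{\alpha+\beta,-\alpha} \neq 0 \tag{11.36}$$

and the nonsingular $r \times r$ matrix $(g_{jk})$ has a block diagonal form:

$$(g_{jk}) = \begin{pmatrix} (g_{ab}) & 0\\ 0 & (g_{\alpha\beta}) \end{pmatrix}, \qquad g_{\alpha\beta} = 0\ \text{unless}\ \beta = -\alpha. \tag{11.37}$$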




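It must then be the case that

$$\det(g_{ab}) \neq 0, \qquad r - l = \text{even integer} \tag{11.38}$$

We can get an expression for the matrix elements $g_{ab}$:

$$\begin{aligned}
g_{ab} &= \sum_{j,k} c^j_{ak}\, c^k_{bj}\\
&= \sum_j \sum_{\alpha \in \mathfrak{R}_0} \alpha_a\, \delta_{j,\alpha}\, c^\alpha_{bj}\\
&= \sum_{\alpha \in \mathfrak{R}_0} \alpha_a\, c^\alpha_{b\alpha}\\
&= \sum_{\alpha \in \mathfrak{R}_0} \alpha_a \alpha_b
\end{aligned} \tag{11.39}$$

Since $(g_{ab})$ is nonsingular, this means that there are enough independent vectors
among the $(r - l)$ vectors $\{\alpha_a\} \in \mathfrak{R}$ to span $l$-dimensional space; further, since
these vectors come in pairs $\pm\underline{\alpha}$, it must be that

$$\tfrac{1}{2}(r - l) \geq l, \qquad \text{i.e.,}\quad \tfrac{1}{2}(r - 3l) = \text{integer} \geq 0 \tag{11.40}$$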
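
The counting conditions (11.38) and (11.40) can be tried against the standard dimensions and ranks of some familiar simple Lie algebras. The values of $r$ and $l$ in the table below are quoted here as known facts and are not derived in the text above; the function name is illustrative.

```python
# (name, dimension r, rank l) for a few simple Lie algebras (standard values)
ALGEBRAS = [
    ("SU(2)", 3, 1),
    ("SU(3)", 8, 2),
    ("SO(5)", 10, 2),
    ("G2", 14, 2),
    ("SU(4)", 15, 3),
    ("E8", 248, 8),
]


def half_surplus(r, l):
    """Return (r - 3l)/2 if r - l is even, otherwise None."""
    if (r - l) % 2 != 0:
        return None
    return (r - 3 * l) // 2


surplus = {name: half_surplus(r, l) for name, r, l in ALGEBRAS}


def su_n(n):
    """Dimension and rank of SU(n): r = n^2 - 1, l = n - 1."""
    return n * n - 1, n - 1
```

For SU(n) the quantity $\tfrac{1}{2}(r - 3l)$ equals $\tfrac{1}{2}(n-1)(n-2)$, which vanishes for SU(2), where the single pair of roots $\pm\alpha$ just suffices to span the one-dimensional root space.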


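7. Let us relate the $l$-component quantities $\alpha^a$ occurring in Eq.(11.30) to the
earlier $\alpha_a$; the former are connected with $[E_\alpha, E_{-\alpha}]$, the latter with $[H_a, E_\alpha]$.
We exploit the total antisymmetry of $c_{jkm}$ for this purpose:

$$\begin{aligned}
c_{\alpha,-\alpha,a} &= -c_{\alpha,a,-\alpha},\\
\text{i.e.,}\quad c^j_{\alpha,-\alpha}\, g_{ja} &= -c^j_{\alpha a}\, g_{j,-\alpha},\\
\text{i.e.,}\quad g_{ab}\, \alpha^b &= g_{\alpha,-\alpha}\, \alpha_a
\end{aligned} \tag{11.41}$$

Therefore, as already mentioned in point six above,

$$|g_{ab}| \neq 0,\quad g_{\alpha,-\alpha} \neq 0,\quad \{\alpha_a\} \neq 0 \Rightarrow \{\alpha^a\} \neq 0 \tag{11.42}$$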


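8. Keeping the $H_a$ unchanged, what happens if we exploit the freedom to
change the scale of each $E_\alpha$ independently? Under the change,

$$E_\alpha \to E'_\alpha = n_\alpha E_\alpha, \qquad \text{no sum on}\ \alpha \tag{11.43}$$

we clearly get:

$$\begin{aligned}
\alpha'_a &= \alpha_a,\\
g'_{ab} &= g_{ab},\\
\alpha'^a &= n_\alpha n_{-\alpha}\, \alpha^a,\\
g'_{\alpha,-\alpha} &= n_\alpha n_{-\alpha}\, g_{\alpha,-\alpha},\\
N'_{\alpha,\beta} &= n_\alpha n_\beta N_{\alpha,\beta} / n_{\alpha+\beta}
\end{aligned} \tag{11.44}$$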




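If we therefore choose the numerical factors $n_\alpha$ to achieve $g_{\alpha,-\alpha} = 1$, then
we get:

$$\begin{aligned}
&\alpha_a = g_{ab}\, \alpha^b,\\
&\alpha^a = g^{ab}\, \alpha_b,\\
&(g^{ab}) = (g_{ab})^{-1},\\
&2 \sum_a \alpha_a \alpha^a - \sum_{\substack{\beta \in \mathfrak{R}_0\\ (\alpha+\beta \in \mathfrak{R}_0)}} N_{\alpha,\beta}\, N_{\alpha+\beta,-\alpha} = 1
\end{aligned} \tag{11.45}$$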

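9. Some useful information on the $N_{\alpha,\beta}$ can be obtained from the Jacobi identities.
Suppose $\alpha, \beta, \alpha + \beta \in \mathfrak{R}_0$, and they are all distinct; then the Jacobi identity for
$E_\alpha, E_\beta, E_{-\alpha-\beta}$ and the fact that the vectors $\underline{\alpha}, \underline{\beta}$ are independent gives:

$$N_{\alpha,\beta} = N_{\beta,-\alpha-\beta} = -N_{\alpha,-\alpha-\beta} \tag{11.46}$$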


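10. We can now put together all the information we have gathered on the
structure of the Lie brackets in a complex simple Lie algebra $\mathcal{L}$. The algebra is
spanned by $H_a$, $a = 1, 2, \ldots, l$ and $E_\alpha$, $\alpha \in \mathfrak{R}_0$, and the brackets among them
are:

$$\begin{aligned}
[H_a, H_b] &= 0,\\
[H_a, E_\alpha] &= \alpha_a E_\alpha,\\
[E_\alpha, E_{-\alpha}] &= \alpha^a H_a,\\
[E_\alpha, E_\beta] &= N_{\alpha,\beta}\, E_{\alpha+\beta}, \qquad \alpha, \beta, \alpha + \beta \in \mathfrak{R}
\end{aligned} \tag{11.47}$$

These equations are called the Cartan-Weyl form for any complex simple Lie
algebra.

One can proceed in this way to get more information on the roots $\underline{\alpha}, \underline{\beta}, \ldots \in \mathfrak{R}$,
their geometrical properties, the structure constants $N_{\alpha,\beta}$ etc. For instance,
if $\underline{\alpha}, \underline{\beta} \in \mathfrak{R}$, one can ask under what conditions $\underline{\beta} + \underline{\alpha}, \underline{\beta} + 2\underline{\alpha}, \ldots$ belong to $\mathfrak{R}$
as well. Such an analysis is presented for example by Racah. At this point,
however, we will switch attention to the real simple compact Lie algebras.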


11.3. The Real Compact Simple Lie Algebras 


To descend from Cartan's classification of complex simple Lie algebras to the
real compact simple ones, we need to depend on some fundamental theorems:

I: Every complex simple Lie algebra $\tilde{\mathcal{L}}$ has bases in which all structure
constants become real. Thus $\tilde{\mathcal{L}}$ definitely can be obtained by a process of complexification
of (several) real Lie algebras $\mathcal{L}, \mathcal{L}', \ldots$.

II (Weyl): Every complex simple Lie algebra $\tilde{\mathcal{L}}$ has a unique compact
simple real form $\mathcal{L}$. Compactness here means that the Lie groups $G$ associated
with $\mathcal{L}$ are compact.




III (Weyl): If $G$ is a (real) compact simple Lie group, and $G'$ is locally
isomorphic to $G$, then $G'$ is also compact. Thus compactness of a real simple
Lie group can be read off from its Lie algebra, which further illuminates Theorem
II above.

We will now state various properties of (real) compact simple Lie algebras,
which taken together with the results already obtained in the complex case, will
permit a complete classification of all possible CSLA's. In the following, $\mathcal{L}$ will
denote some real compact simple Lie algebra of dimension $r$ and rank $l$.

1. There exist bases for $\mathcal{L}$ (real, of course) in which the (real) structure
constants $c^m_{jk}$ become totally antisymmetric and can be written as $c_{jkm}$. This
is because the real metric tensor $g_{jk}$ is positive definite and we can choose the
basis so that it becomes the unit tensor:

$$g_{jk} = \delta_{jk}, \qquad c^m_{jk} = c_{jkm} \tag{11.48}$$


Therefore the adjoint representation generators $X_j$ are $i$ times real antisymmetric
matrices, so at the level of the group the adjoint representation matrices are
real orthogonal. Incidentally the adjoint representation is irreducible because $\mathcal{L}$
is simple.

2. All matrix representations of $\mathcal{L}$ are by hermitian matrices; in particular,
any irreducible representation is, with no loss of generality, by finite dimensional
hermitian matrices. These statements of hermiticity refer of course to the real
elements of $\mathcal{L}$. The Cartan-Weyl basis however has the following behaviour:

$$H_a^\dagger = H_a, \qquad E_\alpha^\dagger = E_{-\alpha} \tag{11.49}$$

Thus, while the $H_a$ are indeed real elements of $\mathcal{L}$, the $E_\alpha$ are complex
combinations of real elements, and in that sense, strictly speaking, they are not
elements of $\mathcal{L}$ at all. They are analogous to the raising and lowering operators
in angular momentum theory. Nevertheless we will use them in our analysis.

3. Since the $H_a$ are hermitian and mutually commute, in any UIR (to save
on words we use the same description as for a group) they can all be simultaneously
diagonalised. They can be taken as part of a complete commuting
set. The $H_a$ are a maximal commuting or abelian subalgebra of $\mathcal{L}$; it is called
a Cartan subalgebra.

4. The quantities $\alpha_a$, $\alpha^a$ and $N_{\alpha,\beta}$ are all real. Now from the expression
(11.39) for $g_{ab}$, we can see that by subjecting the $H_a$ to a real orthogonal
rotation in $l$ dimensions and then rescaling them, we can arrange for $g_{ab}$ to
become $\delta_{ab}$. We can also reduce each $g_{\alpha,-\alpha}$ to unity. When these have been
done, the Cartan-Weyl form of the Lie brackets is

$$\begin{aligned}
[H_a, H_b] &= 0,\\
[H_a, E_\alpha] &= \alpha_a E_\alpha,\\
[E_\alpha, E_{-\alpha}] &= \alpha_a H_a,\\
[E_\alpha, E_\beta] &= N_{\alpha,\beta}\, E_{\alpha+\beta}
\end{aligned} \tag{11.50}$$




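We will prove later (Eq.(12.15)) that

$$\alpha, \beta, \alpha + \beta \in \mathfrak{R}_0 \Rightarrow N_{\alpha,\beta} \neq 0 \tag{11.51}$$

5. The hermiticity property (11.49) for the complex quantities $E_\alpha$ has as a
consequence the relation

$$N_{\alpha,\beta} = N_{-\beta,-\alpha} \tag{11.52}$$

Once we have arranged $g_{ab} = \delta_{ab}$, $g_{\alpha,-\alpha} = 1$, the relation (11.45) can then
be simplified to read:

$$\begin{aligned}
\alpha \in \mathfrak{R}_0: \quad & 2 \sum_{a=1}^{l} (\alpha_a)^2 + \sum_{\substack{\beta \in \mathfrak{R}_0\\ (\alpha+\beta \in \mathfrak{R}_0)}} N_{\alpha,\beta}\, N_{-\alpha,\alpha+\beta}\\
&= 2 \sum_{a=1}^{l} (\alpha_a)^2 + \sum_{\substack{\beta \in \mathfrak{R}_0\\ (\alpha+\beta \in \mathfrak{R}_0)}} N_{\alpha,\beta}\, N_{-\alpha-\beta,\alpha} \qquad \text{by Eq.(11.52)}\\
&= 2 \sum_{a=1}^{l} (\alpha_a)^2 + \sum_{\substack{\beta \in \mathfrak{R}_0\\ (\alpha+\beta \in \mathfrak{R}_0)}} N_{\alpha,\beta}^2 \qquad \text{by Eq.(11.46)}\\
&= 1
\end{aligned} \tag{11.53}$$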
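
As a consistency check of these normalisations (a sketch added for illustration, under the assumption of the simplest case $r = 3$, $l = 1$ with a single pair of roots $\pm\alpha$): there is then no $\beta \in \mathfrak{R}_0$ with $\alpha + \beta \in \mathfrak{R}_0$, so the sum over $\beta$ in Eq.(11.53) is empty and

$$2\alpha^2 = 1, \qquad \alpha = \frac{1}{\sqrt{2}},$$

which agrees with $g_{11} = \sum_{\alpha \in \mathfrak{R}_0} \alpha_1 \alpha_1 = 2\alpha^2 = 1$ from Eq.(11.39). In terms of operators obeying $[J_3, J_\pm] = \pm J_\pm$, $[J_+, J_-] = 2J_3$, one identification consistent with Eq.(11.50) is

$$H = \frac{J_3}{\sqrt{2}}, \qquad E_{\pm\alpha} = \frac{J_\pm}{2}: \qquad [H, E_{\pm\alpha}] = \pm\frac{1}{\sqrt{2}}\, E_{\pm\alpha}, \qquad [E_\alpha, E_{-\alpha}] = \frac{J_3}{2} = \alpha H.$$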


We have now assembled enough factual information about the CSLA’s to per- 
mit their complete analysis and classification. This will be begun in the next 
Chapter. 


Exercises for Chapter 11 


1. The Lie algebras of SO(3) and of the three dimensional homogeneous
Lorentz group SO(2, 1) are respectively

$$[J_1, J_2] = iJ_3, \quad [J_2, J_3] = iJ_1, \quad [J_3, J_1] = iJ_2;$$
$$[J_0, K_1] = iK_2, \quad [J_0, K_2] = -iK_1, \quad [K_1, K_2] = -iJ_0.$$

Show that they have a common complex extension.


2. For the two groups SO(3) and SO(2, 1) whose Lie algebras are given in
problem (1) above, find $r$, $l$, $\mathfrak{R}_0$, $\alpha$, $N_{\alpha,\beta}$ of Eq.(11.47). Assume $H = J_3$ or
$J_0$ respectively.


3. Supply the proofs of Eqs.(11.25)- (11.28). 


Chapter 12 


Geometry of Roots for 
Compact Simple Lie 
Algebras 


We have seen that in any UIR of a compact simple Lie algebra $\mathcal{L}$ the hermitian
Cartan subalgebra generators $H_a$ can be simultaneously diagonalised. A set of
simultaneous eigenvalues for the $H_a$ can be written as an $l$-component real vector
$\mu = \{\mu_a\}$. Such vectors are called weights. We shall explore their properties
in Chapter 16. But it must be clear that root vectors are the weight vectors
appearing in the adjoint representation; and by one of the Cartan theorems, the
non-zero roots are nondegenerate.
Having scaled and arranged our generators so that


    $g_{ab} = \sum_{\alpha\in\mathfrak{R}_0} \alpha_a \alpha_b = \mathrm{Tr}\,(H_a H_b)_{\rm adjoint} = \delta_{ab}$    (12.1)


we can say that the roots of a compact simple Lie algebra of rank $l$ are real
vectors in an $l$-dimensional real Euclidean space. The set $\mathfrak{R}$ contains $r - l$
non-zero distinct root vectors, coming in pairs $\pm\alpha$. As these vectors must obey
Eq.(12.1), there must be exactly $l$ linearly independent vectors in $\mathfrak{R}$ so that, as
we saw earlier in Eq.(11.40), $\frac{1}{2}(r - 3l)$ must be a non-negative integer.

What arrays or geometrical arrangements of distinct pairs of vectors $\pm\alpha$
in $l$-dimensional Euclidean space can arise in the set of roots $\mathfrak{R}$ of some CSLA
of rank $l$? It turns out that there are many restrictions on such arrays, again
on account of the Jacobi conditions. A particularly important part is played by 
a family of (in general not mutually commuting) SU(2) subalgebras, so let us 
begin by recording the structure of the SU(2) Lie algebra and its (hermitian) 
irreducible representations. 

In the Cartan-Weyl form, the elements of the SU(2) algebra are $J_+$, $J_-$ and
$J_3$ obeying the following hermiticity and commutation properties:


    $J_3^\dagger = J_3,\quad J_+^\dagger = J_-,$
    $[J_3, J_\pm] = \pm J_\pm,$
    and $[J_+, J_-] = 2J_3$    (12.2)


The UIR's are labelled by a quantum number $j$ which can take values
0, 1/2, 1, 3/2, .... The $j$th UIR is of dimension $2j + 1$. Within this UIR,
the spectrum of $J_3$ is nondegenerate and consists of the string of eigenvalues
$m = j, j-1, j-2, \ldots, -j+1, -j$:


    $J_3 |j,m\rangle = m\,|j,m\rangle;$    (12.3)


zero occurs if $j$ is integral and does not if it is half odd integral. The action of
$J_\pm$ is

    $J_\pm |j,m\rangle = \sqrt{(j \mp m)(j \pm m + 1)}\;|j,m\pm1\rangle$    (12.4)


That is, while each $|j,m\rangle$ is normalised to unity, the phases can be chosen so
that $J_\pm$ act in this way.
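
As a quick numerical check of Eqs.(12.2)-(12.4), the following Python sketch builds the matrices of $J_3$ and $J_\pm$ in the basis $|j,m\rangle$, ordered as $m = j, j-1, \ldots, -j$, and verifies the commutation relations for the first few values of $j$. The helper names (`su2_matrices`, `commutator`, `close`) are illustrative choices and not notation from the text.

```python
from fractions import Fraction
from math import sqrt, isclose


def su2_matrices(two_j):
    """Matrices (J3, J+, J-) of the UIR with j = two_j/2, basis m = j, j-1, ..., -j."""
    j = Fraction(two_j, 2)
    ms = [j - k for k in range(two_j + 1)]
    dim = len(ms)
    j3 = [[float(ms[r]) if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    jp = [[0.0] * dim for _ in range(dim)]
    for col in range(1, dim):
        m = ms[col]
        # Eq.(12.4): J+ |j,m> = sqrt((j-m)(j+m+1)) |j,m+1>; |j,m+1> has index col-1
        jp[col - 1][col] = sqrt((j - m) * (j + m + 1))
    # J- is the hermitian conjugate of J+; the entries are real, so transpose
    jm = [[jp[c][r] for c in range(dim)] for r in range(dim)]
    return j3, jp, jm


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][c] for k in range(n)) for c in range(n)] for i in range(n)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]


def close(a, b, scale=1.0):
    """True if matrix a equals scale * b up to rounding."""
    return all(isclose(x, scale * y, abs_tol=1e-12)
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))


for two_j in range(0, 6):                              # j = 0, 1/2, 1, ..., 5/2
    j3, jp, jm = su2_matrices(two_j)
    assert close(commutator(j3, jp), jp)               # [J3, J+] = J+
    assert close(commutator(jp, jm), j3, scale=2.0)    # [J+, J-] = 2 J3
```

The half-integer values of $j$ are handled by passing $2j$ as an integer, which keeps the eigenvalues $m$ exact until the final conversion to floating point.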

Let us now identify the different SU(2) algebras contained in a CSLA $\mathcal{L}$.
There are in fact $\frac{1}{2}(r - l)$ distinct ones. For each root vector $\alpha \in \mathfrak{R}$, we have
the subset of commutation relations,


    $[H_a, E_\alpha] = \alpha_a E_\alpha,$
    $[H_a, E_{-\alpha}] = -\alpha_a E_{-\alpha},$
    and $[E_\alpha, E_{-\alpha}] = \alpha_a H_a$    (12.5)


We can see then that for each $\alpha \in \mathfrak{R}_0$ there is an SU(2)$^{(\alpha)}$ algebra:


    SU(2)$^{(\alpha)}$ :  $J_3 = \dfrac{\alpha_a H_a}{|\alpha|^2},\quad J_\pm = \dfrac{\sqrt{2}}{|\alpha|}\,E_{\pm\alpha}$    (12.6)


Further, the hermiticity relations for these SU(2)$^{(\alpha)}$ generators are also in
accordance with Eq.(12.2). The relation between SU(2)$^{(\alpha)}$ and SU(2)$^{(-\alpha)}$ is
very simple:


    SU(2)$^{(\alpha)}$ $\to$ SU(2)$^{(-\alpha)}$ :  $J_3 \to -J_3,\quad J_\pm \to J_\mp$    (12.7)


This is the same as a rotation by $\pi$ about the x-axis, in the usual language
of quantum mechanics and angular momentum.

In the sequel, we choose some $\alpha \in \mathfrak{R}_0$ and keep it fixed. For this
reason we have not put $\alpha$ as an index on $J_3$, $J_\pm$ in Eq.(12.6). We are interested
in the action of SU(2)$^{(\alpha)}$ in the adjoint representation of $\mathcal{L}$: action by $J_+$, $J_-$
or $J_3$ on, say, some $E_\beta$ is the same as commutation of $J_+$, $J_-$ or $J_3$ with $E_\beta$.
Commuting $J_\pm$ with $E_\beta$ should give as a result $E_{\beta\pm\alpha}$ provided $\beta\pm\alpha \in \mathfrak{R}_0$. We
must also remember that for each $\beta \in \mathfrak{R}_0$, $E_\beta$ is unique apart from a scale, and
that the most general irreducible representation of SU(2) is characterised by a
$j$-value and has the form in Eqs.(12.3),(12.4).

Since all this is so, we can conclude: for any $\beta \in \mathfrak{R}$, $E_\beta$ belongs to some
definite UIR of SU(2)$^{(\alpha)}$, carrying definite values of $j$ and $m$. That is, with
respect to commutation with the SU(2)$^{(\alpha)}$ generators, $E_\beta$ is the $m$th component
of a tensor operator of rank $j$. (However the relative scales of the different
components may not yet be in accord with the matrix element values in Eq.(12.4)).
There must be a string of roots, with $p + q + 1$ entries, such that


    $\beta + p\alpha,\ \beta + (p-1)\alpha,\ \ldots,\ \beta + \alpha,\ \beta,\ \beta - \alpha,\ \ldots,\ \beta - (q-1)\alpha,$
    $\beta - q\alpha \in \mathfrak{R};\quad \beta + (p+1)\alpha,\ \beta - (q+1)\alpha \notin \mathfrak{R}$    (12.8)


The point is that we are using our knowledge of all possible SU(2)
representations, and the nondegeneracy of each $\beta \in \mathfrak{R}$, to maximum advantage. Under
commutation with $J_\pm$, $J_3$ the string of (complex) generators,


    $E_{\beta+p\alpha},\ E_{\beta+(p-1)\alpha},\ \ldots,\ E_{\beta+\alpha},\ E_\beta,\ E_{\beta-\alpha},\ \ldots,\ E_{\beta-(q-1)\alpha},\ E_{\beta-q\alpha}$    (12.9)


must behave as the (not yet properly normalised) components of an SU(2)$^{(\alpha)}$
tensor operator. In both Eqs.(12.8),(12.9), it is necessarily true that $p \geq 0$, $q \geq 0$
are integers; and for the moment we are assuming that 0 does not occur in the
set of root vectors (12.8). We can easily read off the $m$-value for $E_\beta$, and the
rank $j$ for the chain (12.9):


    $[J_3, E_\beta] = (\alpha\cdot\beta/|\alpha|^2)\,E_\beta \Rightarrow m = \alpha\cdot\beta/|\alpha|^2;$

    $2j + 1 = p + q + 1 \Rightarrow j = \frac{1}{2}(q + p),\quad m = \frac{1}{2}(q - p)$    (12.10)


Incidentally, the general structure of SU(2) representations tells us that there
can be no gaps either in the string of root vectors (12.8) or generators (12.9).
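
The relation $m = \alpha\cdot\beta/|\alpha|^2 = \frac{1}{2}(q - p)$ of Eq.(12.10) can be tested on a concrete root system. The sketch below assumes, purely as an example not taken from this Chapter, the eight non-zero roots $\pm e_1 \pm e_2$, $\pm e_1$, $\pm e_2$ of SO(5), written with integer components (the overall scale drops out of the ratio). The zero vector is allowed inside a string, where it stands for the generator $\alpha\cdot H$.

```python
from fractions import Fraction
from itertools import product

# Assumed example: non-zero roots of SO(5), i.e. all integer vectors with
# components in {-1, 0, 1} except the zero vector.
ROOTS = {v for v in product((-1, 0, 1), repeat=2) if any(v)}


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def string_indices(alpha, beta, roots):
    """Return (p, q): beta + k*alpha lies in the string for -q <= k <= p, as in Eq.(12.8)."""
    def present(k):
        v = tuple(b + k * a for a, b in zip(alpha, beta))
        return v in roots or not any(v)   # the zero vector corresponds to alpha.H
    p = 0
    while present(p + 1):
        p += 1
    q = 0
    while present(-(q + 1)):
        q += 1
    return p, q


for alpha in ROOTS:
    for beta in ROOTS:
        p, q = string_indices(alpha, beta, ROOTS)
        m = Fraction(dot(alpha, beta), dot(alpha, alpha))
        assert m == Fraction(q - p, 2)    # Eq.(12.10)
```

For $\beta = -\alpha$ the function returns $p = 2$, $q = 0$, the $j = 1$ string $-\alpha, 0, \alpha$.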

In case the vector 0 occurs in the string (12.8), we cannot say that all the
vectors in the string belong to $\mathfrak{R}$. But you can easily convince yourself that the
corresponding generator would have to be $\alpha\cdot H/|\alpha|^2$, namely $J_3$. From here we
can only move "up or down" one step, to $E_\alpha$ or $E_{-\alpha}$, so that $j = 1$ in this case.

What we have therefore done is to use SU(2) representation theory to analyse
the geometry of the root vectors as we move up and down in $l$-dimensional
Euclidean space by amounts $\pm\alpha, \pm2\alpha, \pm3\alpha, \ldots$. We find that the triplet $E_{\pm\alpha}$,
$\alpha\cdot H$ forms an (unnormalised) $j = 1$ operator under SU(2)$^{(\alpha)}$. Every $\beta \in \mathfrak{R}$ other
than $\pm\alpha$ belongs to some chain (12.8) of non-zero roots; this does not conflict
with $m = 0$ being a possible value of $m$ if $j$ is an integer, since it only means
that $\alpha$ and $\beta$ are perpendicular.

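Some information on the $N_{\alpha\beta}$ can be obtained, independent of the relative
normalisations of the $E_\beta$'s. With $j$ and $m$ identified as in Eq.(12.10), we do
have


    $[J_-, [J_+, E_\beta]] = (j - m)(j + m + 1)\,E_\beta$    (12.11)


Since

    $[E_\alpha, E_\beta] = N_{\alpha\beta} E_{\alpha+\beta},$
    $[E_{-\alpha}, E_{\alpha+\beta}] = N_{-\alpha,\alpha+\beta} E_\beta,$
    and $N_{-\alpha,\alpha+\beta} = N_{\alpha\beta}$    (12.12)


and since $j - m = p$, $j + m + 1 = q + 1$, we can get the magnitude of $N_{\alpha\beta}$:


    $N_{\alpha\beta}^2 = \dfrac{|\alpha|^2}{2}\,p\,(q + 1)$    (12.13)


So we can definitely say

    $\alpha, \beta, \alpha+\beta \in \mathfrak{R}_0 \Rightarrow p \geq 1 \Rightarrow N_{\alpha\beta} \neq 0$    (12.14)


This property of $N_{\alpha\beta}$ was quoted earlier in Eq.(11.51).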

The fact that $m$ appearing in Eq.(12.10) is quantised is the origin of the
severe geometrical restrictions on possible systems of root vectors. Before deriving
those restrictions, let us record the following additional properties of roots:


    $\alpha \in \mathfrak{R} \Rightarrow \pm2\alpha, \pm3\alpha, \ldots \notin \mathfrak{R},$
    $\beta = -\alpha$ in Eq.(12.10) $\Rightarrow p = 2,\ q = 0,\ j = -m = 1$    (12.15)

Now all that we did to $E_\beta$ by commutation with the SU(2)$^{(\alpha)}$ generators
can be repeated with the roles of $\alpha$ and $\beta$ interchanged! So for any two root
vectors $\alpha, \beta \in \mathfrak{R}$ distinct or the same, SU(2)$^{(\alpha)}$ analysis tells us there must
exist integers $p, q \geq 0$; while SU(2)$^{(\beta)}$ analysis tells us there must exist integers
$p', q' \geq 0$; and then


    $\alpha\cdot\beta/|\alpha|^2 = \frac{1}{2}(q - p) = \frac{n}{2}$ = positive or negative half integer or zero;
    $\alpha\cdot\beta/|\beta|^2 = \frac{1}{2}(q' - p') = \frac{n'}{2}$ = positive or negative half integer or zero.    (12.16)

If $\theta$ is the angle between $\alpha$ and $\beta$, these restrictions mean:

    $\dfrac{|\beta|}{|\alpha|}\cos\theta = \dfrac{n}{2},\quad \dfrac{|\alpha|}{|\beta|}\cos\theta = \dfrac{n'}{2},$
    $\cos^2\theta = \dfrac{nn'}{4}$    (12.17)


We can draw several conclusions on the possible values of $n$ and $n'$:

    $n = 0 \Leftrightarrow n' = 0 \Leftrightarrow \theta = 90°;$
    $n > 0 \Leftrightarrow n' > 0;$
    $n < 0 \Leftrightarrow n' < 0;$
    $n, n' = 0, \pm1, \pm2, \pm3, \pm4;$
    $0 \leq nn' \leq 4.$    (12.18)




In those cases where both $n$ and $n'$ are non-zero,


    $|\alpha|/|\beta| = \sqrt{n'/n}$    (12.19)


We see that the possible values of angles $\theta$ between roots, and ratios of lengths
of roots, are severely limited. Let us see what are the allowed possibilities.
To start, in principle the pair $(n, n')$ can have 17 possible values:


    $(n, n') = (0, 0);\ \pm(1, 1);\ \pm(1, 2);\ \pm(1, 3);\ \pm(1, 4);$
    $\pm(2, 1);\ \pm(2, 2);\ \pm(3, 1);\ \pm(4, 1)$    (12.20)


But some of these are actually ruled out! The inadmissible ones are $(n, n') =
\pm(1, 4)$ and $\pm(4, 1)$. For instance, because of Eq.(12.15),


    $n = 1,\ n' = 4 \Rightarrow \theta = 0 \Rightarrow \alpha = 2\beta \notin \mathfrak{R}$    (12.21)

and similarly in the other three cases. That leaves 13 possibilities. Of these,

    $(n, n') = \pm(2, 2) \Rightarrow \theta = 0$ or $180°$, $\alpha = \pm\beta$    (12.22)


Leaving these aside as well, we are left with 11 significant possibilities,
which we present in tabular form:


Geometrical conditions on roots in a CSLA


     n      n'     θ       |α|/|β|
     0      0      90°     unspecified
     1      1      60°     1
    -1     -1      120°    1
     1      2      45°     √2
     2      1      45°     1/√2
    -1     -2      135°    √2
    -2     -1      135°    1/√2
     1      3      30°     √3
     3      1      30°     1/√3
    -1     -3      150°    √3
    -3     -1      150°    1/√3
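
The counting that leads from 17 candidate pairs to the 11 rows of the table can be reproduced mechanically. The following Python sketch enumerates the integer pairs allowed by Eq.(12.18), drops those with $nn' = 4$ (which correspond to $\theta = 0$ or $180°$), and evaluates the angle and length ratio from Eqs.(12.17) and (12.19). The function names are illustrative only.

```python
from math import acos, degrees, sqrt


def candidate_pairs():
    """Integer pairs (n, n') of equal sign (or both zero) with 0 <= n n' <= 4, Eq.(12.18)."""
    pairs = []
    for n in range(-4, 5):
        for n2 in range(-4, 5):
            both_zero = n == 0 and n2 == 0
            if both_zero or (n * n2 > 0 and n * n2 <= 4):
                pairs.append((n, n2))
    return pairs


def geometry(n, n2):
    """Angle in degrees and ratio |alpha|/|beta| from Eqs.(12.17) and (12.19)."""
    sign = (n > 0) - (n < 0)                    # cos(theta) has the sign of n
    theta = degrees(acos(sign * sqrt(n * n2) / 2))
    ratio = sqrt(n2 / n) if n != 0 else None    # unspecified for perpendicular roots
    return theta, ratio


pairs = candidate_pairs()
# n n' = 4 gives theta = 0 or 180 degrees: alpha = +-beta, or a forbidden multiple of beta
significant = [pr for pr in pairs if pr[0] * pr[1] != 4]

for n, n2 in significant:
    theta, ratio = geometry(n, n2)
    print(n, n2, round(theta), ratio)
```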


To find and classify all possible CSLA's means to find all possible sets of real
root vectors in Euclidean $l$-dimensional space, for various ranks $l$, obeying all
these geometrical restrictions. This is made tractable by the concepts of positive 
and simple roots, to which we devote the next Chapter. 


Chapter 13 


Positive Roots, Simple 
Roots, Dynkin Diagrams 


13.1 Positive Roots 


Let us make a choice of the Cartan subalgebra generators $H_a$, $a = 1, 2, \ldots, l$
in a definite sequence (of course maintaining $g_{ab} = \delta_{ab}$), and hereafter keep
the sequence unchanged. We shall say that a root $\alpha \in \mathfrak{R}$ is positive if in its
set of components $\alpha = \{\alpha_1, \alpha_2, \ldots, \alpha_l\}$, the first non-vanishing entry is positive.
Otherwise, we shall say $\alpha$ is negative. The set of all positive roots will be denoted
by $\mathfrak{R}_+$, while the set of negative ones will be denoted $\mathfrak{R}_-$. Remembering that
every $\alpha \in \mathfrak{R}$ is nonzero, and that all roots come in pairs $\pm\alpha$, it is clear that
every $\alpha \in \mathfrak{R}$ is definitely either positive or negative, and each of $\mathfrak{R}_+$ and $\mathfrak{R}_-$
has precisely $\frac{1}{2}(r - l)$ vectors:


    $\mathfrak{R} = \mathfrak{R}_+ \cup \mathfrak{R}_-,$
    $\mathfrak{R}$ = set of $r - l$ distinct root vectors,
    $\mathfrak{R}_+\,(\mathfrak{R}_-)$ = set of $\frac{1}{2}(r - l)$ distinct positive (negative) root vectors:

    $\alpha \in \mathfrak{R}_+ \Leftrightarrow -\alpha \in \mathfrak{R}_-$    (13.1)


13.2 Simple Roots and their Properties 


Now we define simple roots: these are a subset of the set $\mathfrak{R}_+$ of positive roots,
so the simple roots form a set $S \subset \mathfrak{R}_+$. A root $\alpha \in \mathfrak{R}_+$ is called a simple root,
$\alpha \in S$, if and only if $\alpha$ cannot be expressed as a non-negative integral linear
combination of other positive roots. We have the inclusion relations


    $\mathfrak{R}$ = all roots $\supset \mathfrak{R}_+$ = all positive roots $\supset S$ = all simple roots.    (13.2)


A finer subdivision of $\mathfrak{R}_+$ will appear later on.




The properties of S are really remarkable and beautiful. We describe and 
prove them one by one. 

(i) The angles between simple roots are further restricted, beyond the
restrictions listed in the previous Chapter. Let $\alpha$, $\beta$ be two distinct simple roots.
Then neither $\alpha - \beta$ nor $\beta - \alpha$ can be a root. Otherwise, one of them would be
in $\mathfrak{R}_+$ and the other in $\mathfrak{R}_-$. If $\alpha - \beta \in \mathfrak{R}_+$, we are able to express


    $\alpha = (\alpha - \beta) + \beta,$    (13.3)


so $\alpha$ could not have been simple. Similarly, $\beta - \alpha \in \mathfrak{R}_+$ would have led to the
conclusion that $\beta$ is not simple. We can state the result as


    $\alpha, \beta \in S,\ \alpha \neq \beta \Rightarrow \alpha - \beta \notin \mathfrak{R}$    (13.4)


Now let us interpret this in the light of the actions of the SU(2)$^{(\alpha)}$ and
SU(2)$^{(\beta)}$ algebras on $\mathfrak{R}$. Using the notations of the previous Chapter,


    SU(2)$^{(\alpha)}$ applied to $E_\beta$ :  $q = 0,\ p \geq 0;$
    SU(2)$^{(\beta)}$ applied to $E_\alpha$ :  $q' = 0,\ p' \geq 0;$

    $\dfrac{\alpha\cdot\beta}{|\alpha|^2} = -\dfrac{p}{2} \leq 0,\quad \dfrac{\alpha\cdot\beta}{|\beta|^2} = -\dfrac{p'}{2} \leq 0$    (13.5)

Referring to Eqs.(12.16)-(12.19), we have $n = -p$ and $n' = -p'$. So the angle $\theta$
between $\alpha$ and $\beta$ must obey


    $\alpha, \beta \in S,\ \alpha \neq \beta:\quad 90° \leq \theta < 180°$    (13.6)


(We must rule out $\theta = 180°$, since that means $\beta = -\alpha$, which conflicts with
both being positive roots!) The possible angles and length ratios for simple
roots form the following subset of the table in Chapter 12:


Geometrical conditions on simple roots in a CSLA


     n      n'     θ       |α|/|β|
     0      0      90°     unspecified
    -1     -1      120°    1
    -1     -2      135°    √2
    -2     -1      135°    1/√2
    -1     -3      150°    √3
    -3     -1      150°    1/√3


(ii) The set of simple roots is linearly independent. For, suppose there is a
nontrivial relation

    $\sum_{\alpha\in S} x_\alpha\,\alpha = 0$    (13.7)


among the simple roots. Since each $\alpha$ here is in $\mathfrak{R}_+$, there must be some $x_\alpha > 0$
and other $x_\alpha < 0$. (Naturally terms with $x_\alpha = 0$ can be ignored). Write $x'_\alpha$
for the former, $-y_\alpha$ for the latter, and splitting the two sets of terms, write
Eq.(13.7) as

    $\gamma = \sum_{\alpha\in S} x'_\alpha\,\alpha = \sum_{\beta\in S} y_\beta\,\beta \neq 0$    (13.8)


Here the simple roots occurring in the two sums are definitely distinct. One
then has

    $0 < |\gamma|^2 = \sum_{\alpha,\beta\in S} x'_\alpha\,y_\beta\;\alpha\cdot\beta \leq 0$    (13.9)


because each $x'_\alpha$, $y_\beta$ is strictly positive, and from point (i) above $\alpha\cdot\beta$ is
non-positive. Since situation (13.9) is an impossibility, the result is proved. It is
interesting to note that the Euclidean geometry for the space of roots was used
in the argument, since that is what ensures $\gamma \neq 0 \Rightarrow |\gamma| > 0$.

(iii) Each positive root $\alpha \in \mathfrak{R}_+$ can be written uniquely as a linear
combination of simple roots with non-negative integer coefficients. For, let $\alpha \in \mathfrak{R}_+$.
If $\alpha \in S$, we are done. If $\alpha \notin S$, it can be written as a positive integral
combination of other positive roots (definition of S!). Do so. If each term occurring
involves a simple root, we are done. If not, the terms involving nonsimple roots
can again be written as a positive integral combination of positive roots. This
process must end with a positive integral combination of simple roots because at
each stage all coefficients are positive integers, all vectors are positive, so there
cannot be an indefinite or unending build-up of terms. The uniqueness of such
an expression follows from property (ii) above of linear independence of simple
roots. We can summarise:


    $\alpha \in \mathfrak{R}_+ \Rightarrow \alpha$ = unique non-negative integer linear combination of
    simple roots,
    $\alpha \in \mathfrak{R}_- \Rightarrow \alpha$ = unique non-positive integer linear combination of
    simple roots

(iv) If we now combine properties (ii) and (iii) above with the fact, noted
at Eq.(12.1), that there are certainly enough independent vectors in $\mathfrak{R}$ to span
$l$-dimensional Euclidean space, we arrive at the nice result that there are exactly
$l$ simple roots in S! Hereafter we write


    $S = \{\alpha^{(1)}, \alpha^{(2)}, \ldots, \alpha^{(l)}\} = \{\alpha^{(a)}\},\quad a = 1, 2, \ldots, l,$    (13.10)


and the contents of point (iii) above are:

    $\alpha \in \mathfrak{R}_+ \Rightarrow \alpha = \sum_{a=1}^{l} n_a\,\alpha^{(a)},\quad n_a \geq 0$, integer, unique for $\alpha$;

    $\alpha \in \mathfrak{R}_- \Rightarrow \alpha = \sum_{a=1}^{l} n_a\,\alpha^{(a)},\quad n_a \leq 0$, integer, unique for $\alpha$.    (13.11)



The point to notice is that in this way of expressing every root $\alpha \in \mathfrak{R}$ in
terms of simple roots, there is never any need to deal with expressions with
some positive and some negative coefficients!
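
The definitions of positive and simple roots translate directly into a few lines of Python. The sketch below again assumes the SO(5) roots $\pm e_1 \pm e_2$, $\pm e_1$, $\pm e_2$ as an example that is not taken from this Chapter. To pick out the simple roots it uses the criterion "a positive root is simple if it is not the sum of two positive roots", which is a commonly used equivalent of the definition above and should be read as an assumption of this sketch.

```python
from itertools import product


def is_positive(root):
    """A root is positive if its first non-vanishing component is positive."""
    for x in root:
        if x != 0:
            return x > 0
    raise ValueError("the zero vector is not a root")


def simple_roots(positive):
    """Positive roots that cannot be written as a sum of two positive roots."""
    sums = {tuple(a + b for a, b in zip(u, v)) for u in positive for v in positive}
    return [r for r in positive if r not in sums]


# Assumed example: the eight non-zero roots of SO(5)
roots = [v for v in product((-1, 0, 1), repeat=2) if any(v)]
positive = [r for r in roots if is_positive(r)]
simple = simple_roots(positive)

print(sorted(positive))   # [(0, 1), (1, -1), (1, 0), (1, 1)]
print(sorted(simple))     # [(0, 1), (1, -1)]
```

There are $l = 2$ simple roots, as property (iv) requires, and the two remaining positive roots are $(1,0) = (1,-1) + (0,1)$ and $(1,1) = (1,-1) + 2\,(0,1)$, with non-negative integer coefficients only.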

(v) Once the set $S$ of simple roots is known, their length-angle relations
already determine completely how to build up $\mathfrak{R}_+$, and so then also $\mathfrak{R}_-$. From
(iv), each positive root is a unique non-negative integral linear combination
of simple roots, so they can be ordered according to the "number of terms".
Introducing


    $N = \sum_{a=1}^{l} n_a,$    (13.12)

we define unique subsets of $\mathfrak{R}_+$ thus:

    $N = 1$:  $S$ = subset of simple roots;
    $N = 2$:  $\mathfrak{R}_+^{(2)}$ = positive roots with $n_1 + n_2 + \ldots + n_l = 2$;
    $N = 3$:  $\mathfrak{R}_+^{(3)}$ = positive roots with $n_1 + n_2 + \ldots + n_l = 3$;    (13.13)

and so on. Thus, $\mathfrak{R}_+$ can be broken up unambiguously as

    $\mathfrak{R}_+ = S \cup \mathfrak{R}_+^{(2)} \cup \mathfrak{R}_+^{(3)} \cup \ldots \cup \mathfrak{R}_+^{(N)} \cup \mathfrak{R}_+^{(N+1)} \cup \ldots$    (13.14)


(Of course for any given CSLA, there is only a finite number of terms here). We
will now show how to build up $\mathfrak{R}_+^{(2)}$ from $S$, and then by induction $\mathfrak{R}_+^{(N+1)}$ out
of $S, \mathfrak{R}_+^{(2)}, \ldots, \mathfrak{R}_+^{(N)}$.

In Chapter 12, we introduced the set of $\frac{1}{2}(r - l)$ distinct SU(2)$^{(\alpha)}$ subalgebras
in $\mathcal{L}$, one for each pair $\pm\alpha$ of roots in $\mathfrak{R}$. Now the simple roots $\{\alpha^{(a)}\}$ in $S$
have been singled out, as having a special significance. We therefore introduce
a special notation for these $l$ SU(2) subalgebras:


    SU(2)$^{(a)}$ = SU(2)$^{(\alpha)}$ for $\alpha = \alpha^{(a)}$, $a = 1, 2, \ldots, l$    (13.15)


Consider now the construction of $\mathfrak{R}_+^{(2)}$. In an expression of the form (13.11),
we cannot have any $n_a = 2$ as the result (12.15) forbids this. So each $\alpha \in \mathfrak{R}_+^{(2)}$
must be the sum of two distinct simple roots. In general,


    $\alpha^{(a)}, \alpha^{(b)} \in S,\ a \neq b,\quad \alpha^{(a)} + \alpha^{(b)} \in \mathfrak{R}_+^{(2)},$    (13.16)

implies (in the same notation as in Eqs.(12.8),(12.16)),


    SU(2)$^{(a)}$ applied to $E_{\alpha^{(b)}}$ :  $q = 0,\ p \geq 1,\quad \dfrac{\alpha^{(a)}\cdot\alpha^{(b)}}{|\alpha^{(a)}|^2} = -\dfrac{p}{2};$

    SU(2)$^{(b)}$ applied to $E_{\alpha^{(a)}}$ :  $q' = 0,\ p' \geq 1,\quad \dfrac{\alpha^{(a)}\cdot\alpha^{(b)}}{|\alpha^{(b)}|^2} = -\dfrac{p'}{2}$    (13.17)



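The vanishing of $q$ and $q'$ already follows from Eq.(13.5). We therefore have the
following simple rule to determine which pairs of distinct simple roots in $S$ can
be added to produce positive roots in $\mathfrak{R}_+^{(2)}$:


    $\alpha^{(a)}, \alpha^{(b)} \in S,\ a \neq b:$
    $\alpha^{(a)}\cdot\alpha^{(b)} = 0 \Rightarrow \alpha^{(a)} + \alpha^{(b)} \notin \mathfrak{R}_+^{(2)},$
    $\alpha^{(a)}\cdot\alpha^{(b)} < 0 \Rightarrow \alpha^{(a)} + \alpha^{(b)} \in \mathfrak{R}_+^{(2)}$    (13.18)


Any two simple roots making an angle of 120°, 135° or 150° can be added to
give a root in $\mathfrak{R}_+^{(2)}$.

Now we use the method of induction. Suppose $\mathfrak{R}_+^{(2)}, \mathfrak{R}_+^{(3)}, \ldots, \mathfrak{R}_+^{(N)}$ have
been built up. How do we construct $\mathfrak{R}_+^{(N+1)}$? We want to answer the question:
if $\beta \in \mathfrak{R}_+^{(N)}$ and $\alpha^{(a)} \in S$, when is $\beta + \alpha^{(a)} \in \mathfrak{R}_+^{(N+1)}$? We examine the way $\mathfrak{R}_+^{(N)}$
has been built up and ask: how often can $\alpha^{(a)}$ be subtracted from $\beta$, leaving a
positive root as result? We seek the value of $q$ such that:


    $\beta - \alpha^{(a)} \in \mathfrak{R}_+^{(N-1)},\ \beta - 2\alpha^{(a)} \in \mathfrak{R}_+^{(N-2)},\ \ldots,\ \beta - q\alpha^{(a)} \in \mathfrak{R}_+^{(N-q)},$
    $\beta - (q+1)\alpha^{(a)} \notin \mathfrak{R}$    (13.19)


Evidently, we are examining the behaviour of $E_\beta$ under SU(2)$^{(a)}$. Because
of the uniqueness statements in point (iv) above, especially Eq.(13.11) and the
comment following, as also the fact that we are concerned only with the actions
of SU(2)$^{(a)}$ and not of SU(2)$^{(\alpha)}$ for all $\alpha \in \mathfrak{R}_0$, we can be sure that we need
not "descend below S" in the sequence (13.19). There will definitely be a value
of $q$ obeying $0 \leq q \leq N - 1$ and satisfying (13.19). The value of the index $p$
associated with this SU(2)$^{(a)}$ action on $E_\beta$ is of course then fixed, since


    $\dfrac{\alpha^{(a)}\cdot\beta}{|\alpha^{(a)}|^2} = \dfrac{1}{2}(q - p)$    (13.20)


If $p$ found in this way is greater than zero, then we can add $\alpha^{(a)}$ to $\beta$ and get a
positive root in $\mathfrak{R}_+^{(N+1)}$, otherwise not:

    $p = 0:\ \beta + \alpha^{(a)} \notin \mathfrak{R}_+^{(N+1)}$
    $p > 0:\ \beta + \alpha^{(a)} \in \mathfrak{R}_+^{(N+1)}$    (13.21)


Thus, the knowledge of the set $S$ of simple roots already contains complete
information on $\mathfrak{R}_+$, hence also $\mathfrak{R}_-$ and $\mathfrak{R}$.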
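
The inductive construction of Eqs.(13.19)-(13.21) is short enough to write out as a program. In the Python sketch below each positive root is stored as its tuple of coefficients $(n_1, \ldots, n_l)$ over the simple roots, and only the scalar products of the simple roots are used. The two inputs are assumed examples: SU(3) simple roots in a three-component integer form $e_1 - e_2$, $e_2 - e_3$ (a rescaled and re-embedded choice, not the normalisation of the text), and SO(5) simple roots $e_1 - e_2$, $e_2$.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def build_positive_roots(simple):
    """All positive roots, as coefficient tuples (n_1, ..., n_l) over the simple roots."""
    l = len(simple)
    gram = [[dot(a, b) for b in simple] for a in simple]
    level = {tuple(int(i == a) for i in range(l)) for a in range(l)}   # N = 1: S itself
    found = set(level)
    while level:
        next_level = set()
        for beta in level:
            for a in range(l):
                # q: how often alpha^(a) can be subtracted while staying among the
                # positive roots already found, Eq.(13.19). For beta = alpha^(a)
                # itself this gives q = 0 and p < 0, so 2*alpha^(a) is never added.
                q = 0
                lower = list(beta)
                lower[a] -= 1
                while tuple(lower) in found:
                    q += 1
                    lower[a] -= 1
                # Eq.(13.20): alpha^(a).beta / |alpha^(a)|^2 = (q - p)/2
                a_dot_beta = sum(beta[b] * gram[a][b] for b in range(l))
                p = q - Fraction(2 * a_dot_beta, gram[a][a])
                if p > 0:                                   # Eq.(13.21)
                    upper = list(beta)
                    upper[a] += 1
                    next_level.add(tuple(upper))
        found |= next_level
        level = next_level
    return found


su3 = [(1, -1, 0), (0, 1, -1)]     # assumed integer form of the SU(3) simple roots
so5 = [(1, -1), (0, 1)]            # assumed SO(5) simple roots

print(sorted(build_positive_roots(su3)))   # [(0, 1), (1, 0), (1, 1)]
print(sorted(build_positive_roots(so5)))   # [(0, 1), (1, 0), (1, 1), (1, 2)]
```

The loop stops by itself once a level produces no new roots, which mirrors the remark that only a finite number of the subsets $\mathfrak{R}_+^{(N)}$ are non-empty.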

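The simplest nontrivial example of all this is SU(3), for which $r = 8$, $l = 2$.
(The case of SU(2) is left as an exercise to the reader, here $r = 3$, $l = 1$, $S = \mathfrak{R}_+$).
We will define and describe the unitary unimodular groups SU($l + 1$) in some
detail in the next Chapter, but there is no harm in exhibiting the SU(3) case as
an illustration of the reconstruction process $S \to \mathfrak{R}_+^{(2)} \to \mathfrak{R}_+^{(3)} \ldots$. Since $l = 2$,
the root vectors can all be drawn in a plane (and that is recommended as an
exercise as well). With suitable normalisations of $H_1$ and $H_2$, it turns out that:


    SU(3):  $\mathfrak{R}_+ = \left\{ (1, 0),\ \left(\tfrac{1}{2}, \tfrac{\sqrt{3}}{2}\right),\ \left(\tfrac{1}{2}, -\tfrac{\sqrt{3}}{2}\right) \right\},$

    $S = \left\{ \left(\tfrac{1}{2}, \tfrac{\sqrt{3}}{2}\right),\ \left(\tfrac{1}{2}, -\tfrac{\sqrt{3}}{2}\right) \right\} = \{\alpha^{(1)}, \alpha^{(2)}\}$, say.    (13.22)


The angle between $\alpha^{(1)}$ and $\alpha^{(2)}$ is 120°, so their sum is an allowed positive
root, giving us back $(1, 0) \in \mathfrak{R}_+$. Writing $\beta = \alpha^{(1)} + \alpha^{(2)}$, we apply SU(2)$^{(1)}$
and SU(2)$^{(2)}$ in turn to $E_\beta$:

    SU(2)$^{(1)}$ applied to $E_\beta$ :  $q = 1;$

    $\dfrac{\alpha^{(1)}\cdot\beta}{|\alpha^{(1)}|^2} = \dfrac{1}{2} = \dfrac{1}{2}(q - p) \Rightarrow p = 0;$

    SU(2)$^{(2)}$ applied to $E_\beta$ :  $q = 1;$

    $\dfrac{\alpha^{(2)}\cdot\beta}{|\alpha^{(2)}|^2} = \dfrac{1}{2} = \dfrac{1}{2}(q - p) \Rightarrow p = 0$    (13.23)


Thus, neither $\alpha^{(1)}$ nor $\alpha^{(2)}$ can be added to $\beta$ to produce another positive root
in $\mathfrak{R}_+^{(3)}$! All the higher subsets $\mathfrak{R}_+^{(3)}, \mathfrak{R}_+^{(4)}, \ldots$, are empty and the reconstruction
concludes at $\mathfrak{R}_+^{(2)}$:

    $\mathfrak{R}_+ = \{\alpha^{(1)}, \alpha^{(2)}, \alpha^{(1)} + \alpha^{(2)}\}$    (13.24)


After we have derived the root systems for the other nontrivial rank 2 CSLA's,
the reader can come back and satisfy himself/herself that all this works.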


13.3 Dynkin Diagrams


All the geometrical information about the angles and length ratios among the
simple roots $\{\alpha^{(a)}\} = S$ of any CSLA $\mathcal{L}$ can be given in a two-dimensional
diagram called the Dynkin diagram. Because of the reconstruction theorem just
established in the previous section, we can in fact say that each possible CSLA
$\mathcal{L}$ corresponds to a possible Dynkin diagram, and vice versa. An allowed system
of simple roots $S$ is called a $\pi$-system; this is the same as an allowed Dynkin
diagram.

The rules for constructing a Dynkin diagram are the following: Each simple
root $\alpha^{(a)} \in S$ is depicted by a circle ○. If $\alpha^{(a)}$ and $\alpha^{(b)}$ are perpendicular to
one another, the circles are left unconnected. For an angle $\theta$ = 120°, 135°, 150°,
we draw one, two or three lines to connect the respective circles:


    θ = 90°:     ○       ○
    θ = 120°:    ○-------○
    θ = 135°:    ○=======○
    θ = 150°:    ○≡≡≡≡≡≡≡○        (13.25)


If two circles are doubly or triply connected, the corresponding simple root
vectors have lengths in the ratio $\sqrt{2}$ or $\sqrt{3}$ respectively. This can be indicated
by, for example, shading the circle for the longer root. We will see examples
later.
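
Since $\cos^2\theta = nn'/4$ by Eq.(12.17), the number of lines joining two circles is just $nn' = 4\cos^2\theta$, which takes the values 0, 1, 2, 3 for $\theta$ = 90°, 120°, 135°, 150°. The Python sketch below uses this to turn a list of simple roots into the connection data of a Dynkin diagram. The input vectors are assumed examples with integer components (SO(5) simple roots $e_1 - e_2$, $e_2$ and SO(8) simple roots of the form $e_a - e_{a+1}$, $e_3 + e_4$), and the function names are illustrative.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def dynkin_lines(alpha, beta):
    """Number of lines between two distinct simple roots: n n' = 4 cos^2(theta)."""
    lines = Fraction(4 * dot(alpha, beta) ** 2, dot(alpha, alpha) * dot(beta, beta))
    if dot(alpha, beta) > 0 or lines not in (0, 1, 2, 3):
        raise ValueError("not an allowed pair of simple roots")
    return int(lines)


def dynkin_diagram(simple):
    """Connected pairs {(a, b): number of lines}, with 1-based labels and a < b."""
    diagram = {}
    for a in range(len(simple)):
        for b in range(a + 1, len(simple)):
            lines = dynkin_lines(simple[a], simple[b])
            if lines:
                diagram[(a + 1, b + 1)] = lines
    return diagram


so5 = [(1, -1), (0, 1)]                                          # assumed example
so8 = [(1, -1, 0, 0), (0, 1, -1, 0), (0, 0, 1, -1), (0, 0, 1, 1)]  # assumed example

print(dynkin_diagram(so5))   # {(1, 2): 2}
print(dynkin_diagram(so8))   # {(1, 2): 1, (2, 3): 1, (2, 4): 1}
```

Pairs that do not appear in the output are unconnected circles, that is, perpendicular simple roots.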

It is well to remember that a Dynkin diagram depicts some geometrical
arrangement of $l$ independent vectors in Euclidean $l$-dimensional space; and if
any two circles in the diagram are unconnected, those two simple roots are at
90° to one another.

An important fact about Dynkin diagrams, which we now prove, is this:
For a simple compact Lie algebra $\mathcal{L}$, the Dynkin diagram as a whole must be
connected; it cannot split into two (or more) disconnected parts. The proof is
as follows. Suppose the set of all simple roots, $S$, splits into the union of two
subsets, $S^{(1)}$ and $S^{(2)}$, with every root in $S^{(1)}$ perpendicular to every root in
$S^{(2)}$. For the present proof only, let us generically write $\alpha$ for simple roots in
$S^{(1)}$, $\beta$ for those in $S^{(2)}$:


    $S = S^{(1)} \cup S^{(2)},$
    $\alpha \in S^{(1)},\ \beta \in S^{(2)}:\quad \alpha\cdot\beta = 0$    (13.26)


It follows from Eq.(13.18) that $\mathfrak{R}_+^{(2)}$ consists of vectors like $\alpha^{(1)} + \alpha^{(2)}$, $\beta^{(1)} +
\beta^{(2)}$, but none of the form $\alpha + \beta$. Next, in $\mathfrak{R}_+^{(3)}$, can we have a combination like
$\gamma = \alpha^{(1)} + \alpha^{(2)} + \beta$? If so, it must be that $\alpha^{(1)} + \alpha^{(2)} \in \mathfrak{R}_+^{(2)}$. Apply SU(2)$^{(\beta)}$
to this vector in $\mathfrak{R}_+^{(2)}$: clearly, $q = 0$, but also $p = 0$, since


    $(\alpha^{(1)} + \alpha^{(2)})\cdot\beta = 0$    (13.27)


So there could not have been a vector like $\gamma$ above in $\mathfrak{R}_+^{(3)}$. And so on for higher
orders. We thus see that if the simple roots $S$ split into two disjoint sets as in
(13.26), so do all possible roots:

    $\mathfrak{R} = \mathfrak{R}^{(1)} \cup \mathfrak{R}^{(2)},$
    $\mathfrak{R}^{(1)}$ = linear combinations of $\alpha$'s;
    $\mathfrak{R}^{(2)}$ = linear combinations of $\beta$'s;

    $\alpha \in \mathfrak{R}^{(1)},\ \beta \in \mathfrak{R}^{(2)}:\quad \alpha\cdot\beta = 0$    (13.28)



In particular there are no roots of the form $\alpha + \beta$ in $\mathfrak{R}$. Consequently, all the
$E_\alpha$'s commute with all the $E_\beta$'s:


    $\alpha \in \mathfrak{R}^{(1)},\ \beta \in \mathfrak{R}^{(2)}:\quad [E_\alpha, E_\beta] = 0$    (13.29)


We are close to the final result. We need only to check the properties of the
Cartan subalgebra generators $H_a$. Already we know,


    $\alpha \in S^{(1)},\ \beta \in \mathfrak{R}^{(2)}:\quad [\alpha\cdot H, E_\beta] = 0;$
    $\beta \in S^{(2)},\ \alpha \in \mathfrak{R}^{(1)}:\quad [\beta\cdot H, E_\alpha] = 0$    (13.30)


But now, since the simple roots in $S$ are independent and span the full space,
the $H_a$ can be replaced by the independent combinations $\alpha\cdot H$ for $\alpha \in S^{(1)}$ and
$\beta\cdot H$ for $\beta \in S^{(2)}$. And then the entire Lie algebra splits into the direct sum of
two commuting subalgebras, so it is not simple!
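
The connectedness criterion is easy to test on a given set of simple roots: two circles are joined exactly when the scalar product of the roots is non-zero. The Python sketch below performs a simple graph search. As assumed inputs it takes the pair $e_1 - e_2$, $e_1 + e_2$ (the rank 2 member of the SO($2l$) pattern, i.e. SO(4)) and four SO(8) simple roots; both are examples chosen for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def is_connected(simple):
    """True if the Dynkin diagram of the given simple roots is connected."""
    n = len(simple)
    seen, stack = {0}, [0]
    while stack:
        a = stack.pop()
        for b in range(n):
            # circles a and b are joined iff the simple roots are not perpendicular
            if b not in seen and dot(simple[a], simple[b]) != 0:
                seen.add(b)
                stack.append(b)
    return len(seen) == n


so4 = [(1, -1), (1, 1)]                                            # assumed example
so8 = [(1, -1, 0, 0), (0, 1, -1, 0), (0, 0, 1, -1), (0, 0, 1, 1)]  # assumed example

print(is_connected(so4))   # False
print(is_connected(so8))   # True
```

The disconnected SO(4) diagram, two unjoined circles, signals a Lie algebra that is not simple, in line with the argument just given.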

We have outlined the general properties of systems of roots and simple
roots, and developed the diagrammatic method which can concisely convey all
the essential properties of a simple root system $S$. The problem of complete
classification of all CSLA's is then clear: find all possible "allowed" Dynkin
diagrams, i.e., $\pi$-systems! This programme will be taken up and completed in
Chapter 15. As an interlude, however, we look in the next Chapter at the four
classical families of groups SO($2l$), SO($2l + 1$), USp($2l$) and SU($l + 1$): these
are "almost all" of the possible compact simple Lie groups, there being only
five others! Apart from providing an interlude, Chapter 14 will introduce us to
these groups and their Lie algebras, and give us tangible examples of the general
theory of Chapter 12 and the present Chapter.


Exercises for Chapters 12 and 13 


1. Supply the proofs of Eqs.(12.13), (12.16)-(12.22).


2. For the common Lie algebra of SO(3) and SU(2), with hermitian generators
$J_1, J_2, J_3$ obeying

    $[J_j, J_k] = i\,\epsilon_{jkm} J_m,$


choosing $J_3$ as the (single) Cartan subalgebra generator $H_a$ with $a = 1$:
arrange $J_1$ and $J_2$ into suitable complex combinations $E_{\pm\alpha}$, find the set
of all roots $\mathfrak{R} = \{\alpha\}$, the set of positive roots $\mathfrak{R}_+$ and of simple roots $S$.
Show that in this case, Eq.(12.22) applies.


Chapter 14 


Lie Algebras and Dynkin
Diagrams for SO($2l$),
SO($2l + 1$), USp($2l$), SU($l + 1$)


Let us begin with some general remarks. For each of the four classical families of
groups, we shall start with a defining representation, which is naturally faithful.
Throughout this Chapter, we shall uniformly use the symbol $D$ for defining
representations. With the matrices of this in hand, we can find and parametrise
elements near the identity, so read off the basic Lie bracket relations, identify
$H_a$ and $E_\alpha$, $\mathfrak{R}$ and $\mathfrak{R}_+$ and $S$ etc. In the defining representation $D$, as in any
UIR, the simultaneous eigenvalue sets for the $H_a$ are the weights $\mu = \{\mu_a\}$ of
that representation. As we will see in more detail in Chapter 16, the general
relationship between roots and weights is


Roots ~ weights of the adjoint representation 
~ differences of weights of general representations (14.1) 


This must be kept in mind in what follows. 

For each family of groups we will adopt this sequence: defining representation
$D$; infinitesimal generators and commutation relations; Cartan subalgebra
generators $H_a$; weights $\mu$ occurring in $D$; set of all roots $\mathfrak{R}$; positive roots $\mathfrak{R}_+$;
simple roots $S$; the associated Dynkin diagram. Uniformly, the index $l$ denotes
the rank of the concerned group.


14.1 The SO($2l$) Family - $D_l$ of Cartan


These are the groups of real, orthogonal, unimodular rotations in Euclidean
spaces of even number of dimensions. For $l = 1$, we have rotations in a plane,
an abelian group; for $l = 2$, the group SO(4) happens to have a non-simple Lie
algebra. Therefore in discussing the family of CSLA's in the case of SO($2l$) we
limit $l$ to $l \geq 3$. The corresponding algebras (more precisely, their complex
forms) were called $D_l$ by Cartan.

The matrices $S$ belonging to the defining representation $D$ of SO($2l$) are
real, $2l$-dimensional, orthogonal and unimodular:


    $S \in$ SO($2l$):  $S$ = $2l$-dimensional,
    $S^* = S,$
    $S^T S = 1,$
    $\det S = +1$    (14.2)


Each $S$ describes a proper rotation in $2l$-dimensional Euclidean space. Let
indices $A, B, C, \ldots$ go over the range $1, 2, \ldots, 2l$. The generator matrices for $D$
can be found by determining the form of an $S$ close to the identity:


    $S \simeq 1 - i\epsilon X,\ S^T S = 1,\ S^* = S,\ |\epsilon| \ll 1 \Rightarrow$
    $X^* = -X,\ X^T = -X$    (14.3)


Thus, the most general $X$ is $i$ times a real antisymmetric $2l$-dimensional matrix.
We can construct a basis for such matrices quite easily: writing $M_{AB}$ for them,
we define


    $(M_{AB})_{CD} = i(\delta_{AC}\delta_{BD} - \delta_{AD}\delta_{BC}),\quad A, B = 1, 2, \ldots, 2l$    (14.4)


Here, only $C$ and $D$ are row and column indices, and in them we have
antisymmetry. However, $A$ and $B$ enumerate the various generator matrices, and since
we have antisymmetry in them too,


    $M_{AB} = -M_{BA}$    (14.5)


the number of independent generators is $l(2l - 1)$. Thus the order of SO($2l$) is
$l(2l - 1)$. While one might use the $M_{AB}$ for $A < B$, say, as an independent set
of generators, it is more symmetrical to use all $M_{AB}$ subject to the conditions
(14.5), in general discussions.

The commutation relations among the $M_{AB}$ are:


    $[M_{AB}, M_{CD}] = i(\delta_{BC} M_{AD} - \delta_{AC} M_{BD} + \delta_{AD} M_{BC} - \delta_{BD} M_{AC})$    (14.6)


These equations completely fix the structure of the Lie algebra $D_l$: all we
need to do is expose its contents suitably! Any UIR of SO($2l$) is generated
by hermitian $M_{AB}$ (in a suitable complex space of suitable dimension) obeying
these same commutation relations.
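
The commutation relations can be confirmed directly from the defining matrices of Eq.(14.4). The Python sketch below does this for $2l = 4$ with plain nested lists. The right-hand side it tests, $[M_{AB}, M_{CD}] = i(\delta_{BC} M_{AD} - \delta_{AC} M_{BD} + \delta_{AD} M_{BC} - \delta_{BD} M_{AC})$, is the form obtained by multiplying out the matrices of Eq.(14.4); indices run from 0 in the code rather than from 1, and all function names are illustrative.

```python
from itertools import product


def delta(a, b):
    return 1 if a == b else 0


def generator(A, B, n):
    """Matrix M_AB of Eq.(14.4): (M_AB)_CD = i (d_AC d_BD - d_AD d_BC), indices 0..n-1."""
    return [[1j * (delta(A, C) * delta(B, D) - delta(A, D) * delta(B, C))
             for D in range(n)] for C in range(n)]


def matmul(x, y):
    n = len(x)
    return [[sum(x[r][k] * y[k][c] for k in range(n)) for c in range(n)] for r in range(n)]


def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(xy, yx)]


def combine(terms, n):
    """Sum of coefficient * matrix over a list of (coefficient, matrix) pairs."""
    total = [[0j] * n for _ in range(n)]
    for coeff, mat in terms:
        for r in range(n):
            for c in range(n):
                total[r][c] += coeff * mat[r][c]
    return total


n = 4                                        # 2l with l = 2
M = {(A, B): generator(A, B, n) for A in range(n) for B in range(n)}

for A, B, C, D in product(range(n), repeat=4):
    lhs = commutator(M[A, B], M[C, D])
    rhs = combine([(+1j * delta(B, C), M[A, D]),
                   (-1j * delta(A, C), M[B, D]),
                   (+1j * delta(A, D), M[B, C]),
                   (-1j * delta(B, D), M[A, C])], n)
    assert lhs == rhs                        # entries are exact, so == is safe
```

All matrix entries are small integers times 1 or $i$, so the comparison is exact and no tolerance is needed.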

Towards identifying the elements of the Cartan subalgebra, it is useful to
divide the $2l$ values of $A, B, \ldots$ into $l$ pairs: the $a$th pair consists of the index
values $2a - 1$, $2a$; and $a$ ranges from 1 to $l$. Within each pair, we can have indices
$r, s, \ldots$ going over just the values 1, 2:


    $A, B, \ldots \to ar, bs, \ldots;$
    $a, b, \ldots = 1, 2, \ldots, l;\quad r, s, \ldots = 1, 2;$
    $A = 2(a - 1) + r$    (14.7)

If we now take


    $H_1 = M_{12},\ H_2 = M_{34},\ \ldots,\ H_l = M_{2l-1,2l},$
    i.e., $H_a = M_{2a-1,2a} = M_{a1,a2}$    (14.8)


we do see that because none of the Kronecker deltas in Eq.(14.6) "click", these
commute with each other:


    $[H_a, H_b] = 0,\quad a, b = 1, 2, \ldots, l$    (14.9)


Geometrically too this is obvious: $H_1$ generates SO(2) rotations in the 1-2
plane, $H_2$ in the 3-4 plane, and so on. It can next be easily shown that these
$H_a$ are a maximal commuting subset of generators. If one takes a general linear
combination $X$ of the $M_{AB}$ and imposes the condition that it commute with
each of the $H_a$, one quickly discovers that it must be a linear combination of
the $H_a$'s:


    $X = \frac{1}{2} x_{AB} M_{AB},\ [X, H_a] = 0,\ a = 1, 2, \ldots, l \Rightarrow$
    $X = x_{12} H_1 + x_{34} H_2 + \ldots$    (14.10)


Thus, the $H_a$ do span a Cartan subalgebra of $D_l$, so the rank is $l$.

The matrices $S$ of the defining representation of SO($2l$) act on $2l$-component
real Euclidean vectors. To find the weights of this representation $D$, we
must simultaneously diagonalise all the $H_a$, working in the complex domain if
necessary. To clarify the situation, weights in any UIR, including the case of
$D$, are $l$-component real vectors; the corresponding simultaneous eigenvectors
of the $H_a$ are vectors in the representation space, and so in the case of $D$ they
are $2l$-component quantities. Since the $H_a$ are block-diagonal with the forms
(in the representation $D$!):


    $H_a = \mathrm{diag}\left(0, \ldots, 0, \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}, 0, \ldots, 0\right)$, with the $2\times2$ block in the $a$th position,    (14.11)


it is quite easy to find their simultaneous eigenvalues. If $H_1$ has eigenvalue $\pm1$,
then $H_2, H_3, \ldots$ have eigenvalues zero; when $H_2$ has eigenvalue $\pm1$, $H_1, H_3, \ldots$
have eigenvalues zero; and so on. Let us write $e_a$, $a = 1, 2, \ldots, l$ for the unit
vectors in $l$-dimensional Euclidean root and weight space:


    $e_a = (0, 0, \ldots, 1, 0, \ldots, 0),\quad a = 1, 2, \ldots, l$  (with the 1 in the $a$th position)    (14.12)

Then there are $2l$ weights $\{\mu\}$ in the representation $D$, and they are:

    $D$ of SO($2l$):  $\{\mu\} = \{\pm e_1, \pm e_2, \ldots, \pm e_l\}$    (14.13)

Each of these weights is nondegenerate in this representation, and their number
correctly gives the dimension of $D$.
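
The weights of $D$ can be exhibited explicitly. With $H_a = M_{2a-1,2a}$ built from Eq.(14.4), the $2\times2$ block of $H_a$ is $\begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}$, and the (unnormalised) vectors with components $(1, \mp i)$ in the $a$th pair of slots are simultaneous eigenvectors. The Python sketch below checks this for $l = 3$. The choice of eigenvectors and the function names are assumptions of the sketch.

```python
def cartan_generator(a, l):
    """H_a = M_{2a-1,2a} of Eqs.(14.4) and (14.8) as a 2l x 2l matrix, a = 1, ..., l."""
    n = 2 * l
    h = [[0j] * n for _ in range(n)]
    h[2 * a - 2][2 * a - 1] = 1j      # (M_AB)_AB = +i
    h[2 * a - 1][2 * a - 2] = -1j     # (M_AB)_BA = -i
    return h


def apply(mat, vec):
    return [sum(row[k] * vec[k] for k in range(len(vec))) for row in mat]


def weight_vectors(l):
    """Map each simultaneous eigenvector (a tuple) to its weight (an l-component tuple)."""
    table = {}
    for a in range(1, l + 1):
        for sign in (+1, -1):
            vec = [0j] * (2 * l)
            vec[2 * a - 2] = 1 + 0j
            vec[2 * a - 1] = -sign * 1j          # (1, -i) for +1, (1, +i) for -1
            weight = []
            for b in range(1, l + 1):
                image = apply(cartan_generator(b, l), vec)
                # read off the eigenvalue by testing the candidates +1, 0, -1
                eig = next(lam for lam in (1, 0, -1)
                           if image == [lam * x for x in vec])
                weight.append(eig)
            table[tuple(vec)] = tuple(weight)
    return table


weights = sorted(set(weight_vectors(3).values()), reverse=True)
print(weights)
# [(1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 0, -1), (0, -1, 0), (-1, 0, 0)]
```

The six weights $\pm e_1, \pm e_2, \pm e_3$ match Eq.(14.13) for $l = 3$, one for each dimension of $D$.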

Once we have got the weights in the defining representation $D$, the general
relationship (14.1) between roots and weights suggests that the roots might be
of the forms $\pm e_a \pm e_b$, for $a \neq b$, and $\pm 2e_a$. Which of these actually occur
in the set $\mathfrak{R}$ of all roots? The number of distinct roots is determined to be
$l(2l - 1) - l = 2l(l - 1)$. We can determine the set $\mathfrak{R}$ as follows. It was already
mentioned that an independent set of generators is $M_{AB}$ for $A < B$. Let us list
them, using the split index notation $A \to ar$, $B \to bs$ of Eq.(14.7) as follows:


    $a = b,\ r = 1,\ s = 2$:  $M_{a1,a2} = H_a;$
    $a < b$:  $M_{a1,b1} = X_{ab},\ M_{a1,b2} = Y_{ab},\ M_{a2,b1} = Z_{ab},\ M_{a2,b2} = W_{ab}$    (14.14)


The subset of commutation relations between $H$'s on the one hand, and $X, Y, Z, W$
on the other, are:


    $[H_a, X_{bc}] = -i(\delta_{ab} Z_{bc} + \delta_{ac} Y_{bc}),$
    $[H_a, Y_{bc}] = i(-\delta_{ab} W_{bc} + \delta_{ac} X_{bc}),$
    $[H_a, Z_{bc}] = i(\delta_{ab} X_{bc} - \delta_{ac} W_{bc}),$
    $[H_a, W_{bc}] = i(\delta_{ab} Y_{bc} + \delta_{ac} Z_{bc})$    (14.15)

Here it is assumed that $b < c$, and there is no summation on repeated indices on
the right hand side. To find the set $\mathfrak{R}$ of all roots, we must form combinations
of $X, Y, Z, W$ which, upon commutation with each $H_a$, go into multiples of
themselves. Some algebra shows that if we define,


    $b < c,\ \epsilon = \pm1,\ \epsilon' = \pm1:\quad \alpha = \epsilon\,e_b + \epsilon'\,e_c,$
    $E_\alpha = X_{bc} - i\epsilon Z_{bc} - i\epsilon' Y_{bc} - \epsilon\epsilon' W_{bc},$    (14.16)

then,

    $[H_a, E_\alpha] = \alpha_a E_\alpha$    (14.17)


By choosing all possible pairs $b, c$ obeying $b < c$; and for each pair all four
choices of $\epsilon, \epsilon'$; we do get enough combinations from which all the $X, Y, Z, W$
can be recovered. Thus the complete set of roots $\mathfrak{R}$ is:


    SO($2l$):  $\mathfrak{R} = \{\pm e_a \pm e_b,\ a < b\}$    (14.18)
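
The statement $[H_a, E_\alpha] = \alpha_a E_\alpha$ for the combinations $E_\alpha = X_{bc} - i\epsilon Z_{bc} - i\epsilon' Y_{bc} - \epsilon\epsilon' W_{bc}$ with $\alpha = \epsilon e_b + \epsilon' e_c$ can be checked with the explicit matrices of Eq.(14.4). The Python sketch below does so for $l = 3$ and counts the distinct roots obtained. Indices are 1-based as in the text, and the helper names are illustrative.

```python
from itertools import combinations, product


def generator(A, B, n):
    """M_AB of Eq.(14.4) with 1-based labels A != B: entry (A,B) = i, entry (B,A) = -i."""
    m = [[0j] * n for _ in range(n)]
    m[A - 1][B - 1] = 1j
    m[B - 1][A - 1] = -1j
    return m


def idx(a, r):
    """Split index of Eq.(14.7): A = 2(a-1) + r with r = 1, 2."""
    return 2 * (a - 1) + r


def matmul(x, y):
    n = len(x)
    return [[sum(x[r][k] * y[k][c] for k in range(n)) for c in range(n)] for r in range(n)]


def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(xy, yx)]


def root_generator(b, c, eps, eps2, l):
    """E_alpha of Eq.(14.16) for alpha = eps*e_b + eps2*e_c, with b < c."""
    n = 2 * l
    X = generator(idx(b, 1), idx(c, 1), n)
    Y = generator(idx(b, 1), idx(c, 2), n)
    Z = generator(idx(b, 2), idx(c, 1), n)
    W = generator(idx(b, 2), idx(c, 2), n)
    return [[X[r][s] - 1j * eps * Z[r][s] - 1j * eps2 * Y[r][s] - eps * eps2 * W[r][s]
             for s in range(n)] for r in range(n)]


l = 3
roots = set()
for (b, c), (eps, eps2) in product(combinations(range(1, l + 1), 2),
                                   product((1, -1), repeat=2)):
    E = root_generator(b, c, eps, eps2, l)
    alpha = tuple(eps * (a == b) + eps2 * (a == c) for a in range(1, l + 1))
    for a in range(1, l + 1):
        H = generator(idx(a, 1), idx(a, 2), 2 * l)
        expected = [[alpha[a - 1] * x for x in row] for row in E]
        assert commutator(H, E) == expected          # Eq.(14.17)
    roots.add(alpha)

assert len(roots) == 2 * l * (l - 1)                 # 12 distinct roots for l = 3
```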




We see that the vectors +2e, do not appear as roots; and as both a and 6b run 
from 1 to 2, the number of distinct roots is exactly 2/(1 — 1) as expected. 
The subset of positive roots is easily identified 


Ry = {e, te,,a < bd} (14.19) 


There are i(/ — 1) of them. To find which of these are simple, some analysis is 
needed. It helps to look at low values of J, and then generalise. One finds: 


S = {€1 — €2,€2 — €3;---, G2 


| 
ie 
Io 
| 
+ 
Im 
ae 


£715 €j— 


1) _ BQ). t-1) _ ty 
a) =e, — 6,0) =e, —e,,...,.a°-) =e, -e,a4% =e, +e 


Each simple root is of length /2, so the length ratios are unity. Any two 
simple roots are either orthogonal or make an angle of 120°; the non-zero scalar 
products among simple roots are the following: 


a) - of?) = a?) . a) —— alt-2) -af-}) = of?) : a) = —1 (14.21) 


From all this information the Dynkin diagram can be immediately drawn:

$\pi$-system for $SO(2l) = D_l$:

                                                  O  α^(l-1)
                                                 /
    O ----- O ----- ... ----- O ----- O
  α^(1)   α^(2)            α^(l-3)  α^(l-2)
                                                 \
                                                  O  α^(l)            (14.22)


Remember that any two unconnected circles represent mutually perpendicular 
simple roots! 


14.2 The SO(2l + 1) Family: $B_l$ of Cartan


The preceding analysis of $SO(2l)$ considerably simplifies the work of similarly
treating the proper rotation group in an odd number of dimensions, $SO(2l+1)$.
Now the defining representation $D$ consists of $(2l+1)$-dimensional real,
orthogonal, unimodular matrices:

$$S \in SO(2l+1):\quad S = (2l+1)\text{-dimensional},\quad S^* = S,\quad S^T S = 1,\quad \det S = +1 \tag{14.23}$$


These rotations act on $(2l+1)$-component Euclidean vectors. Now we let the
vector and tensor indices $A, B, \ldots$ go over the range $1, 2, \ldots, 2l+1$: the range
appropriate for $SO(2l)$, plus one more value, namely $(2l+1)$.




By examining the form of an $S$ close to the identity, one finds again that
a basis for the Lie algebra in the defining representation consists of matrices
$M_{AB} = -M_{BA}$ with the same expression (14.4) for matrix elements; the only
difference is that these are now $(2l+1)$-dimensional matrices, and the number
of independent matrices is $l(2l+1)$. Thus the order of the group $SO(2l+1)$
is $l(2l+1)$. Even the commutation relations (14.6) retain their validity for the
present group, with the ranges of $A, B, C, D$ extended!

One can once again use the split index notation $A \to ar$ to cover the range
$1, 2, \ldots, 2l$, and then separately include the value $A = 2l+1$.

The $SO(2l)$ choice of Cartan subalgebra generators $H_a$, $a = 1, \ldots, l$ serves as
a Cartan subalgebra for $SO(2l+1)$ too: even with $(2l+1)$-dimensional matrices,
one easily checks that any generator matrix $X$ commuting with $H_1, H_2, \ldots, H_l$
has to be a linear combination of them. Thus $SO(2l+1)$ has rank $l$. In the
present defining representation $D$, the $H_a$ are the $2l$-dimensional matrices which
we had with $SO(2l)$, plus one extra row and one extra column at the ends
consisting entirely of zeros. This immediately tells us the weights $\boldsymbol{\mu}$ in $D$: they are
the same as with $SO(2l)$, plus the weight $\mathbf{0}$:

$$D \text{ of } SO(2l+1):\quad \{\boldsymbol{\mu}\} = \{\pm\mathbf{e}_1, \pm\mathbf{e}_2, \ldots, \pm\mathbf{e}_l, \mathbf{0}\} \tag{14.24}$$

The number of distinct weights is $2l+1$, so they are all nondegenerate.

Turning to the system of roots $\mathfrak{R}$, since the $H_a$ are "unchanged" in going
from $SO(2l)$ to $SO(2l+1)$, all previous roots remain valid; the $SO(2l)$ generators
are a subalgebra. The total number of roots is $2l^2$, which exceeds by $2l$ the
number of roots for $SO(2l)$. These new roots must arise from the bracket relations
between $H_a$ and the extra generators $M_{A,2l+1}$. Using split index notation for
the first index here, we find:

$$[H_a,\ M_{b1,2l+1} - i\epsilon M_{b2,2l+1}] = \epsilon\,\delta_{ab}\,(M_{b1,2l+1} - i\epsilon M_{b2,2l+1}),\quad \epsilon = \pm 1,\ \text{no sum on } b \tag{14.25}$$

Thus we have the new roots $\pm\mathbf{e}_b$, $b = 1, 2, \ldots, l$ here! So the full root system is

$$SO(2l+1):\quad \mathfrak{R} = \{\pm\mathbf{e}_a \pm \mathbf{e}_b,\ a < b;\ \pm\mathbf{e}_a\} \tag{14.26}$$

The positive roots are immediately recognised:

$$\mathfrak{R}_+ = \{\mathbf{e}_a \pm \mathbf{e}_b,\ a < b;\ \mathbf{e}_a\} \tag{14.27}$$

What about the simple roots? Those $SO(2l)$ roots which were not simple
will again be not simple, since $SO(2l) \subset SO(2l+1)$. So the new simple roots
must be some subset of the old ones, and the new positive roots. Thus we must
search among

$$\mathbf{e}_1 - \mathbf{e}_2,\ \mathbf{e}_2 - \mathbf{e}_3,\ \ldots,\ \mathbf{e}_{l-1} - \mathbf{e}_l,\ \mathbf{e}_{l-1} + \mathbf{e}_l,\ \mathbf{e}_1,\ \mathbf{e}_2,\ \ldots,\ \mathbf{e}_l$$

and remove those that are expressible as positive integer combinations of others.
In this way, we are able to eliminate $\mathbf{e}_{l-1} + \mathbf{e}_l, \mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_{l-1}$. The survivors are




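the simple roots for $SO(2l+1)$:

$$\begin{aligned}
S &= \{\mathbf{e}_1 - \mathbf{e}_2,\ \mathbf{e}_2 - \mathbf{e}_3,\ \ldots,\ \mathbf{e}_{l-1} - \mathbf{e}_l,\ \mathbf{e}_l\}; \\
\boldsymbol{\alpha}^{(1)} &= \mathbf{e}_1 - \mathbf{e}_2,\quad \boldsymbol{\alpha}^{(2)} = \mathbf{e}_2 - \mathbf{e}_3,\ \ldots,\quad \boldsymbol{\alpha}^{(l-1)} = \mathbf{e}_{l-1} - \mathbf{e}_l,\quad \boldsymbol{\alpha}^{(l)} = \mathbf{e}_l
\end{aligned} \tag{14.28}$$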


Notice that the geometry is different! All of $\boldsymbol{\alpha}^{(1)}, \boldsymbol{\alpha}^{(2)}, \ldots, \boldsymbol{\alpha}^{(l-1)}$ have length
$\sqrt{2}$, and successive ones make an angle of 120° with one another (those two or
more steps apart are perpendicular!). The last root $\boldsymbol{\alpha}^{(l)}$ has unit length, and
makes an angle of 135° with $\boldsymbol{\alpha}^{(l-1)}$: all this agrees with the general angle-length
ratio restrictions of the previous Chapter. Thus we obtain the Dynkin diagram
for $SO(2l+1)$:

$\pi$-system for $SO(2l+1) = B_l$:

    ● ----- ● ----- ● ----- ... ----- ● ===== O
  α^(1)   α^(2)   α^(3)            α^(l-1)   α^(l)            (14.29)

The longer roots have been represented by shaded circles.


14.3 The USp(2l) Family: $C_l$ of Cartan


The family of unitary symplectic groups, defined only in complex spaces with
an even number of dimensions, is somewhat unfamiliar to most physicists. We
therefore analyse its Lie algebra structure in a little detail. Curiously, the groups
$USp(2l)$ seem to lie "mid-way between" $SO(2l)$ and $SO(2l+1)$, in some aspects
resembling the former and in others the latter.

The defining representation $D$ of $USp(2l)$ consists of complex $2l$-dimensional
matrices $U$ obeying two conditions: the unitary condition, and the symplectic
condition. To set up the latter, we have to define a "symplectic metric". As
with $SO(2l)$, let indices $A, B, \ldots$ go over $1, 2, \ldots, 2l$; and use split index notation
when needed. Then the symplectic metric is the $2l$-dimensional real antisymmetric
matrix

$$\begin{aligned}
\eta = (\eta_{AB}) &= \begin{pmatrix} i\sigma_2 & & & \\ & i\sigma_2 & & \\ & & \ddots & \\ & & & i\sigma_2 \end{pmatrix},\qquad i\sigma_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}; \\
\eta_{AB} &= \eta_{ar,bs} = \delta_{ab}\,\epsilon_{rs},\qquad (\epsilon_{rs}) = i\sigma_2
\end{aligned} \tag{14.30}$$

(It is because $\eta$ has to be both antisymmetric and nonsingular that we must
have an even number of dimensions.) The important properties of $\eta$ are:

$$\eta^2 = -1,\qquad \eta^T = \eta^{-1} = -\eta \tag{14.31}$$




Now the matrices $U \in D$ are defined by the two requirements

$$U^\dagger U = 1,\qquad U^T \eta\, U = \eta \tag{14.32}$$

Let us examine the form of a $U$ close to the identity:

$$U \simeq 1 - i\epsilon J,\quad |\epsilon| \ll 1:\qquad J^\dagger = J,\quad J^T\eta + \eta J = 0 \tag{14.33}$$

The symplectic condition on $J$ can also be expressed as

$$(\eta J)^T = \eta J \tag{14.34}$$

We must find in a convenient way the real independent parameters in a matrix $J$
obeying these conditions; that will give us the order of the group, a basis for the
Lie algebra etc. It is useful to recognise that there is a subgroup formed by the
direct product of independent $SU(2)$'s acting on the pairs $12, 34, \ldots, (2l-1)\,2l$:

$$\begin{array}{ccccccc}
SU(2) & \times & SU(2) & \times \cdots \times & SU(2) & \subset & USp(2l) \\
A = 1, 2 & & 3, 4 & \cdots & 2l-1,\ 2l & & \\
\text{i.e., } a = 1 & & 2 & \cdots & l & &
\end{array} \tag{14.35}$$


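With the use of split indices, we break up $J$ into $2 \times 2$ blocks as follows.
On the main diagonal, we have $J^{(a)}$ at the $a$-th position, for $a = 1, 2, \ldots, l$. Then
for $a \neq b$, we have $J^{(ab)}$ at the intersection of the $a$-th pair of rows and $b$-th pair
of columns. Then the picture is, schematically:

$$(J_{AB}) = \begin{pmatrix}
J^{(1)} & \cdots & J^{(1b)} & \cdots \\
\vdots & \ddots & \vdots & \\
J^{(a1)} & \cdots & J^{(ab)} & \cdots \\
\vdots & & \vdots & \ddots
\end{pmatrix} \tag{14.36}$$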


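One can immediately verify that the blocks in $J^\dagger$, $\eta J$ and $(\eta J)^T$ are related to
those in $J$ in the following ways:

$$\begin{aligned}
(J^\dagger)^{(a)} &= J^{(a)\dagger}, & (J^\dagger)^{(ab)} &= J^{(ba)\dagger}; \\
(\eta J)^{(a)} &= i\sigma_2 J^{(a)}, & (\eta J)^{(ab)} &= i\sigma_2 J^{(ab)}; \\
((\eta J)^T)^{(a)} &= (i\sigma_2 J^{(a)})^T, & ((\eta J)^T)^{(ab)} &= (i\sigma_2 J^{(ba)})^T
\end{aligned} \tag{14.37}$$

Therefore, the conditions on $J$ in Eq.(14.33) translate into these statements on
the $2 \times 2$ blocks:

$$\begin{aligned}
J^\dagger = J:&\quad J^{(a)\dagger} = J^{(a)},\quad J^{(ab)\dagger} = J^{(ba)}; \\
\eta J = \text{symmetric}:&\quad i\sigma_2 J^{(a)} = \text{symmetric},\quad i\sigma_2 J^{(ab)} = (i\sigma_2 J^{(ba)})^T
\end{aligned} \tag{14.38}$$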




For each of $J^{(a)}$ and $J^{(ab)}$ we can introduce a complex linear combination of
the $2 \times 2$ unit matrix and the Pauli matrices, and see how the coefficients get
restricted. Then the result is that each $J^{(a)}$ is a real linear combination of the
three Pauli matrices, symbolically,

$$J^{(a)} = x\sigma_1 + y\sigma_2 + z\sigma_3,\quad x, y, z \text{ real},\quad a = 1, 2, \ldots, l \tag{14.39}$$

Thus these $J^{(a)}$'s are in fact the generators of the subgroup $SU(2) \times SU(2)
\times \ldots \times SU(2)$ of Eq.(14.35). Each $J^{(ab)}$ for $a < b$ is a pure imaginary multiple
of the unit matrix plus a real linear combination of the $\sigma$'s:

$$a < b:\quad J^{(ab)} = x\sigma_1 + y\sigma_2 + z\sigma_3 + iw,\quad x, y, z, w \text{ real};\qquad J^{(ba)} = J^{(ab)\dagger} \tag{14.40}$$

The parameters in the $J^{(ab)}$ for different $a < b$ pairs are of course independent.
Counting up the $3l$ independent parameters in the $J^{(a)}$, and the $2l(l-1)$ parameters
in the $J^{(ab)}$, we see that there are in all $l(2l+1)$ independent parameters
in $J$. Thus the order of $USp(2l)$, as in the case of $SO(2l+1)$, is $l(2l+1)$.

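The descriptions of $J^{(a)}$ and $J^{(ab)}$ in terms of Pauli matrices guide us in
choosing a basis for the Lie algebra. We take

$$\begin{aligned}
X_j^{(a)} &= \sigma_j \text{ at the } a\text{-th diagonal block},\ = 0 \text{ elsewhere};\quad a = 1, 2, \ldots, l;\ j = 1, 2, 3; \\
X^{(ab)} &= \pm i \cdot 1 \text{ at the } ab \text{ and } ba \text{ blocks} = \begin{pmatrix} \cdot & i\,1 \\ -i\,1 & \cdot \end{pmatrix},\quad a < b = 1, 2, \ldots, l; \\
X_j^{(ab)} &= \sigma_j \text{ at the } ab \text{ and } ba \text{ blocks} = \begin{pmatrix} \cdot & \sigma_j \\ \sigma_j & \cdot \end{pmatrix},\quad a < b = 1, 2, \ldots, l
\end{aligned} \tag{14.41}$$

The following subset of commutation relations follows quite easily:

$$\begin{aligned}
[X_j^{(a)}, X_k^{(b)}] &= 2i\,\delta_{ab}\,\epsilon_{jkm} X_m^{(a)}; \\
b < c:\quad [X_j^{(a)}, X^{(bc)}] &= i(\delta_{ab} - \delta_{ac}) X_j^{(bc)}; \\
[X_j^{(a)}, X_k^{(bc)}] &= i(\delta_{ac} - \delta_{ab})\,\delta_{jk} X^{(bc)} + i(\delta_{ab} + \delta_{ac})\,\epsilon_{jkm} X_m^{(bc)},\quad \text{no sums on } b, c
\end{aligned} \tag{14.42}$$

These are all the relations involving at least one generator of $SU(2) \times SU(2) \times \ldots \times SU(2)$.
The remaining commutation relations to complete the Lie algebra structure are
left as an exercise.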
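Relations of this kind are easy to get wrong by a sign, so a numerical check is worthwhile. The sketch below builds explicit $4 \times 4$ matrices for $l = 2$ and tests the two relations between $X_j^{(a)}$ and the off-diagonal generators, in the form $[X_j^{(a)}, X^{(bc)}] = i(\delta_{ab} - \delta_{ac}) X_j^{(bc)}$ and $[X_j^{(a)}, X_k^{(bc)}] = i(\delta_{ac} - \delta_{ab})\delta_{jk} X^{(bc)} + i(\delta_{ab} + \delta_{ac})\epsilon_{jkm} X_m^{(bc)}$. It assumes the convention that $X^{(ab)}$ carries $+i$ times the unit matrix in the $ab$ block and $-i$ times it in the $ba$ block; all function names are illustrative.

```python
I2 = [[1, 0], [0, 1]]
SIGMA = {
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}

def block_matrix(l, blocks):
    """Build a 2l x 2l matrix from {(a, b): 2x2 block}; a, b are 1-based block labels."""
    M = [[0j] * (2 * l) for _ in range(2 * l)]
    for (a, b), blk in blocks.items():
        for r in range(2):
            for s in range(2):
                M[2 * (a - 1) + r][2 * (b - 1) + s] = complex(blk[r][s])
    return M

def scaled(c, A):
    return [[c * x for x in row] for row in A]

def added(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(A, B):
    return added(matmul(A, B), scaled(-1, matmul(B, A)))

def is_close(A, B, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

def eps(j, k, m):
    """Levi-Civita symbol for indices taken from {1, 2, 3}."""
    return (j - k) * (k - m) * (m - j) // 2

def X_diag(l, a, j):
    """X_j^(a): sigma_j in the a-th diagonal block."""
    return block_matrix(l, {(a, a): SIGMA[j]})

def X_unit(l, a, b):
    """X^(ab): +i*1 in block ab, -i*1 in block ba (assumed sign convention)."""
    return block_matrix(l, {(a, b): scaled(1j, I2), (b, a): scaled(-1j, I2)})

def X_sigma(l, a, b, j):
    """X_j^(ab): sigma_j in blocks ab and ba."""
    return block_matrix(l, {(a, b): SIGMA[j], (b, a): SIGMA[j]})

def check_relations(l=2, b=1, c=2):
    """Return True if both mixed commutation relations hold for all a, j, k."""
    ok = True
    for a in range(1, l + 1):
        dab, dac = int(a == b), int(a == c)
        for j in (1, 2, 3):
            lhs = commutator(X_diag(l, a, j), X_unit(l, b, c))
            rhs = scaled(1j * (dab - dac), X_sigma(l, b, c, j))
            ok = ok and is_close(lhs, rhs)
            for k in (1, 2, 3):
                lhs = commutator(X_diag(l, a, j), X_sigma(l, b, c, k))
                rhs = scaled(1j * (dac - dab) * int(j == k), X_unit(l, b, c))
                for m in (1, 2, 3):
                    coeff = 1j * (dab + dac) * eps(j, k, m)
                    rhs = added(rhs, scaled(coeff, X_sigma(l, b, c, m)))
                ok = ok and is_close(lhs, rhs)
    return ok

if __name__ == "__main__":
    print(check_relations())
```

Flipping the assumed sign of $X^{(ab)}$ flips the sign of the $\delta_{jk}$ term and of the right-hand side of the first relation together, so the check is sensitive to the convention chosen in Eq.(14.41).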




For a Cartan subalgebra, one might guess that the set $H_a = X_3^{(a)}$ would do.
It turns out that this is so: these generators of the individual $U(1)$'s within each
$SU(2)$ do form a maximal abelian set of generators. Thus the rank of $USp(2l)$
is seen to be $l$. Fortunately in the defining representation $D$ the $H_a$ are already
diagonal, since $H_a$ is simply $\sigma_3$ in the $a$-th diagonal block. So the weights in $D$
are:

$$D \text{ of } USp(2l):\quad \{\boldsymbol{\mu}\} = \{\pm\mathbf{e}_1, \pm\mathbf{e}_2, \ldots, \pm\mathbf{e}_l\} \tag{14.43}$$

There are $2l$ distinct weights, formally the same as in the $SO(2l)$ case, and each
is nondegenerate.

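The total number of roots is order minus rank, so $2l^2$ as for $SO(2l+1)$. The
$a$-th factor in the product subgroup $SU(2) \times SU(2) \times \ldots \times SU(2)$ already supplies
us with the roots $\pm 2\mathbf{e}_a$: these correspond to the raising and lowering operators
of that $SU(2)$. There must then be $2l(l-1)$ additional roots.

These turn out to be $\pm\mathbf{e}_a \pm \mathbf{e}_b$, $a < b$, which incidentally were the complete
set of roots for $SO(2l)$. For the roots $\pm 2\mathbf{e}_a$, the $E_\alpha$ operators are easy:

$$[H_a,\ X_1^{(b)} \pm iX_2^{(b)}] = \pm 2\delta_{ab}\,(X_1^{(b)} \pm iX_2^{(b)}),\quad \text{no sum} \tag{14.44}$$

For the rest, we find from the partial list of commutation relations (14.42):

$$\begin{aligned}
b < c:\quad [H_a,\ X^{(bc)} + i\epsilon X_3^{(bc)}] &= \epsilon(\delta_{ab} - \delta_{ac})(X^{(bc)} + i\epsilon X_3^{(bc)}),\quad \text{no sum}; \\
[H_a,\ X_1^{(bc)} + i\epsilon X_2^{(bc)}] &= \epsilon(\delta_{ab} + \delta_{ac})(X_1^{(bc)} + i\epsilon X_2^{(bc)}),\quad \text{no sum}; \\
\epsilon &= \pm 1
\end{aligned} \tag{14.45}$$

The former therefore provide us with $E_\alpha$'s for roots of the form $\pm(\mathbf{e}_b - \mathbf{e}_c)$, the
latter for roots of the form $\pm(\mathbf{e}_b + \mathbf{e}_c)$. The full list is then,

$$USp(2l):\quad \mathfrak{R} = \{\pm\mathbf{e}_a \pm \mathbf{e}_b,\ a < b;\ \pm 2\mathbf{e}_a\} \tag{14.46}$$

Positive roots are the subset of $l^2$ vectors

$$\mathfrak{R}_+ = \{\mathbf{e}_a \pm \mathbf{e}_b,\ a < b;\ 2\mathbf{e}_a\} \tag{14.47}$$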

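In finding the set $S$ of simple roots, we can drop those positive roots which
in the case of $SO(2l)$ were not simple. This is similar to the procedure we
adopted for $SO(2l+1)$. Thus, now $S$ is a subset of

$$\mathbf{e}_1 - \mathbf{e}_2,\ \mathbf{e}_2 - \mathbf{e}_3,\ \ldots,\ \mathbf{e}_{l-1} - \mathbf{e}_l,\ \mathbf{e}_{l-1} + \mathbf{e}_l,\ 2\mathbf{e}_1,\ 2\mathbf{e}_2,\ \ldots,\ 2\mathbf{e}_l$$

The vectors that get eliminated at this stage are, as in the $SO(2l+1)$ case,
$\mathbf{e}_{l-1} + \mathbf{e}_l, 2\mathbf{e}_1, 2\mathbf{e}_2, \ldots, 2\mathbf{e}_{l-1}$. So we are left with

$$\begin{aligned}
S &= \{\mathbf{e}_1 - \mathbf{e}_2,\ \mathbf{e}_2 - \mathbf{e}_3,\ \ldots,\ \mathbf{e}_{l-1} - \mathbf{e}_l,\ 2\mathbf{e}_l\}; \\
\boldsymbol{\alpha}^{(1)} &= \mathbf{e}_1 - \mathbf{e}_2,\quad \boldsymbol{\alpha}^{(2)} = \mathbf{e}_2 - \mathbf{e}_3,\ \ldots,\quad \boldsymbol{\alpha}^{(l-1)} = \mathbf{e}_{l-1} - \mathbf{e}_l,\quad \boldsymbol{\alpha}^{(l)} = 2\mathbf{e}_l
\end{aligned} \tag{14.48}$$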




Notice the difference from the $SO(2l+1)$ case: now the first $l-1$ simple roots
each have length $\sqrt{2}$, while the $l$-th has length 2, so the former are shorter. The
$USp(2l)$ Dynkin diagram follows:

$\pi$-system for $USp(2l) = C_l$:

    O ----- O ----- O ----- ... ----- O ===== ●
  α^(1)   α^(2)   α^(3)            α^(l-1)   α^(l)            (14.49)

The difference as compared to $SO(2l+1)$ is that now only one circle is
shaded: in Eq.(14.29), we have $(l-1)$ shaded circles.

It is illuminating to collect together the results of this section and the two
previous ones, so that one can see at a glance the $SO(2l)$, $USp(2l)$, $SO(2l+1)$
pattern (the common rank $l$ is suppressed; in the last column $a = 1, \ldots, l-1$):

Group                Order        dim D     {μ} in D     Roots ℜ                   Simple roots S
SO(2l) = D_l         l(2l-1)      2l        ±e_a         ±e_a ± e_b                {e_a - e_{a+1}}, e_{l-1} + e_l
USp(2l) = C_l        l(2l+1)      2l        ±e_a         ±e_a ± e_b, ±2e_a         {e_a - e_{a+1}}, 2e_l
SO(2l+1) = B_l       l(2l+1)      2l+1      ±e_a, 0      ±e_a ± e_b, ±e_a          {e_a - e_{a+1}}, e_l


The ways in which $USp(2l)$ resembles $SO(2l)$, and those in which it is more
like $SO(2l+1)$, are made evident. Of course intrinsically the symplectic groups
are very different from the orthogonal ones, and for these the kind of intuitive
Euclidean geometric ideas one is accustomed to are of no use.


14.4 The SU(l + 1) Family: $A_l$ of Cartan


The algebraic work involved in analysing the defining representation $D$ for
$SO(2l)$, $SO(2l+1)$ and $USp(2l)$, and then deducing weights, roots etc., has
been fairly clean. While the unitary groups $SU(l+1)$ are easier to define and
picture, because of familiarity with quantum mechanics, they turn out to be in
a numerical sense somewhat awkward when it comes to determination of roots,
weights etc. Finally, of course, simplicity is regained in the Dynkin diagram!

The defining representation $D$ of $SU(l+1)$ consists of all complex $(l+1)$-dimensional
unitary unimodular matrices,

$$U^\dagger U = 1_{l+1},\qquad \det U = +1. \tag{14.50}$$

As we know from experience in quantum mechanics, the infinitesimal generators
are all hermitian traceless $(l+1)$-dimensional matrices (and it is the
tracelessness that causes much of the ungainliness in the following expressions!).
Since such a matrix involves $l(l+2)$ independent real parameters, that is the
order of $SU(l+1)$.




There is a "tensor" way of describing the generators and Lie algebra of
$SU(l+1)$, which we now describe but which we do not use later on. Let indices
$j, k, m, \ldots$ run over $1, 2, \ldots, l+1$. Introduce the following set of matrices in
$l+1$ dimensions:

$$\begin{aligned}
(A^j{}_k)^m{}_n &= \delta^j_n\,\delta^m_k - \frac{1}{l+1}\,\delta^j_k\,\delta^m_n, \\
(A^j{}_k)^\dagger &= A^k{}_j, \\
A^j{}_j &= 0, \\
\mathrm{Tr}\,A^j{}_k &= 0
\end{aligned} \tag{14.51}$$

Any hermitian traceless $(l+1)$-dimensional matrix $X$, such that $U \simeq 1 - i\epsilon X$ is
an infinitesimal element of $SU(l+1)$, can be uniquely written as a combination
of these $A$'s:

$$X = x^k{}_j\,A^j{}_k,\qquad (x^k{}_j)^* = x^j{}_k,\qquad x^j{}_j = 0. \tag{14.52}$$

Thus, the $A^j{}_k$, subject to the vanishing of $A^j{}_j$, can be used as a (nonreal) basis
for the Lie algebra of $SU(l+1)$. Their commutation relations are quite simple:

$$[A^j{}_k, A^m{}_n] = \delta^j_n\,A^m{}_k - \delta^m_k\,A^j{}_n \tag{14.53}$$

One can then define the Lie algebra of $SU(l+1)$ by these brackets, the hermiticity
condition in Eq.(14.51), and the understanding that the $A^j{}_k$ are an overcomplete
basis for the Lie algebra since the sum $A^j{}_j$ must vanish.

While one can certainly proceed with such a scheme, we shall instead start
afresh and identify the Cartan subalgebra, weights and roots in a different fashion.
In the defining representation $D$, the subgroup of diagonal matrices can be
taken as the source of the $H_a$: thus we need a complete independent set of real
diagonal $(l+1)$-dimensional traceless matrices. That these will be maximal is
obvious. Any $X$ commuting with all diagonal traceless matrices is itself necessarily
diagonal. We therefore define the elements of the Cartan subalgebra in




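$D$ as follows (all off-diagonal elements vanish):

$$\begin{aligned}
H_1 &= \frac{1}{\sqrt{1 \cdot 2}}\,\mathrm{diag}(1, -1, 0, \ldots, 0), \\
H_2 &= \frac{1}{\sqrt{2 \cdot 3}}\,\mathrm{diag}(1, 1, -2, 0, \ldots, 0), \\
&\ \ \vdots \\
H_a &= \frac{1}{\sqrt{a(a+1)}}\,\mathrm{diag}(\underbrace{1, \ldots, 1}_{a}, -a, 0, \ldots, 0), \\
&\ \ \vdots \\
H_l &= \frac{1}{\sqrt{l(l+1)}}\,\mathrm{diag}(1, 1, \ldots, 1, -l)
\end{aligned} \tag{14.54}$$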


The rank of $SU(l+1)$ is verified to be $l$. These matrices are basically the
$A^1{}_1, A^2{}_2, \ldots, A^l{}_l$ of the tensor description suitably rearranged and recombined;
the numerical factors have been chosen so that in the defining representation
$D$, we get

$$\mathrm{Tr}\,H_a H_b = \delta_{ab}. \tag{14.55}$$
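The normalisation (14.55) can be confirmed numerically for any rank. The sketch below assumes the diagonal form of $H_a$ described above ($a$ entries equal to 1, then $-a$, then zeros, all divided by $\sqrt{a(a+1)}$); the helper names are illustrative only.

```python
import math

def cartan_diagonal(l, a):
    """Diagonal entries of H_a for SU(l+1): a ones, then -a, then zeros, normalised."""
    norm = math.sqrt(a * (a + 1))
    return [1.0 / norm] * a + [-a / norm] + [0.0] * (l - a)

def trace_of_product(d1, d2):
    """Tr(H H') for two diagonal matrices given by their diagonals."""
    return sum(x * y for x, y in zip(d1, d2))

def normalisation_ok(l, tol=1e-12):
    """Check tracelessness and Tr(H_a H_b) = delta_ab for all a, b = 1..l."""
    H = [cartan_diagonal(l, a) for a in range(1, l + 1)]
    for a in range(l):
        if abs(sum(H[a])) > tol:
            return False
        for b in range(l):
            expected = 1.0 if a == b else 0.0
            if abs(trace_of_product(H[a], H[b]) - expected) > tol:
                return False
    return True

if __name__ == "__main__":
    for l in (1, 2, 3, 6):
        print(l, normalisation_ok(l))
```

Orthogonality for $a < b$ follows because the first $a+1$ entries of $H_b$ are all equal, while the entries of $H_a$ sum to zero.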


For dealing with the weights $\{\boldsymbol{\mu}\}$ in $D$ and later in finding the $E_\alpha$ combinations,
it helps to use quantum mechanical ket vector notation for the $l+1$ basic
states on which the $SU(l+1)$ matrices $U \in D$ are made to act. Thus we shall
have the kets $|j\rangle$ for $j = 1, 2, \ldots, l+1$. Since the $H_a$ in Eq.(14.54) are already
all diagonal, the weights occurring and the corresponding eigenvectors can all

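be directly read off. We give the expressions in some detail:

$$\begin{aligned}
H_a|j\rangle &= \mu_a^{(j)}|j\rangle,\quad \text{no sum},\quad a = 1, 2, \ldots, l;\ j = 1, 2, \ldots, l+1; \\
\boldsymbol{\mu}^{(1)} &= \left(\frac{1}{\sqrt{1 \cdot 2}}, \frac{1}{\sqrt{2 \cdot 3}}, \frac{1}{\sqrt{3 \cdot 4}}, \ldots, \frac{1}{\sqrt{a(a+1)}}, \ldots, \frac{1}{\sqrt{l(l+1)}}\right), \\
\boldsymbol{\mu}^{(2)} &= \left(\frac{-1}{\sqrt{1 \cdot 2}}, \frac{1}{\sqrt{2 \cdot 3}}, \frac{1}{\sqrt{3 \cdot 4}}, \ldots, \frac{1}{\sqrt{l(l+1)}}\right), \\
\boldsymbol{\mu}^{(3)} &= \left(0, \frac{-2}{\sqrt{2 \cdot 3}}, \frac{1}{\sqrt{3 \cdot 4}}, \ldots, \frac{1}{\sqrt{l(l+1)}}\right), \\
&\ \ \vdots \\
\boldsymbol{\mu}^{(j)} &= \left(0, \ldots, 0, \frac{-(j-1)}{\sqrt{(j-1)j}}, \frac{1}{\sqrt{j(j+1)}}, \ldots, \frac{1}{\sqrt{l(l+1)}}\right), \\
&\ \ \vdots \\
\boldsymbol{\mu}^{(l+1)} &= \left(0, 0, \ldots, 0, \frac{-l}{\sqrt{l(l+1)}}\right)
\end{aligned} \tag{14.56}$$

The weights $\boldsymbol{\mu}^{(j)}$, $j = 1, 2, \ldots, l+1$ of the defining representation $D$ are all
distinct and nondegenerate.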


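To find the roots and $E_\alpha$ combinations, we take the remaining generators
of $SU(l+1)$ to be:

$$\begin{aligned}
(A_{jk})_{mn} &= \delta_{jm}\delta_{kn} + \delta_{jn}\delta_{km},\quad j \neq k; \\
(B_{jk})_{mn} &= -i(\delta_{jm}\delta_{kn} - \delta_{jn}\delta_{km}),\quad j \neq k
\end{aligned} \tag{14.57}$$

These are of course not independent, though they are hermitian:

$$A_{jk}^\dagger = A_{jk} = A_{kj},\qquad B_{jk}^\dagger = B_{jk} = -B_{kj} \tag{14.58}$$

Thus, there are $\tfrac{1}{2}l(l+1)$ $A$'s and an equal number of $B$'s which are independent.
The $E_\alpha$'s happen to be the complex combinations,

$$E_{jk} = \tfrac{1}{2}(A_{jk} + iB_{jk}),\quad j \neq k;\qquad (E_{jk})_{mn} = \delta_{jm}\delta_{kn} \tag{14.59}$$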


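Thus (in the representation $D$), $E_{jk}$ has only one non-zero matrix element, at
the $j$-th row and $k$-th column. Since the actions of both $H_a$ and $E_{jk}$ on the kets $|j\rangle$
are extremely simple, the roots are easily obtained:

$$\begin{aligned}
&H_a|j\rangle = \mu_a^{(j)}|j\rangle,\quad E_{jk}|m\rangle = \delta_{km}|j\rangle\ \Rightarrow \\
&[H_a, E_{jk}] = (\mu_a^{(j)} - \mu_a^{(k)})\,E_{jk},\quad j \neq k,\ \text{no sum}.
\end{aligned} \tag{14.60}$$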




Therefore, the roots are all differences of weights of the defining representation!

$$SU(l+1):\quad \mathfrak{R} = \{\boldsymbol{\mu}^{(j)} - \boldsymbol{\mu}^{(k)},\ j \neq k,\ j, k = 1, 2, \ldots, l+1\} \tag{14.61}$$


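From here to find the positive roots, and then the simple ones, one has
to examine in detail the weight vector components listed in Eq.(14.56). The
positive roots are found to be the following:

$$\begin{aligned}
\mathfrak{R}_+:\quad &\boldsymbol{\mu}^{(1)} - \boldsymbol{\mu}^{(j)},\quad j = 2, 3, \ldots, l, l+1; \\
&\boldsymbol{\mu}^{(3)} - \boldsymbol{\mu}^{(2)}; \\
&\boldsymbol{\mu}^{(4)} - \boldsymbol{\mu}^{(j)},\quad j = 2, 3; \\
&\ \ \vdots \\
&\boldsymbol{\mu}^{(l)} - \boldsymbol{\mu}^{(j)},\quad j = 2, 3, \ldots, l-1; \\
&\boldsymbol{\mu}^{(l+1)} - \boldsymbol{\mu}^{(j)},\quad j = 2, 3, \ldots, l.
\end{aligned} \tag{14.62}$$

The number of vectors here is $\tfrac{1}{2}l(l+1)$, exactly half the number in $\mathfrak{R}$. A more
concise way of describing $\mathfrak{R}_+$ is this:

$$\mathfrak{R}_+ = \{\boldsymbol{\mu}^{(1)} - \boldsymbol{\mu}^{(j)},\ j \geq 2;\ \ \boldsymbol{\mu}^{(j)} - \boldsymbol{\mu}^{(k)},\ j > k \geq 2\} \tag{14.63}$$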


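Out of these positive roots to pick out the simple ones is again a bit tedious,
but by looking at small values of $l$, like $l = 2, 3, 4, \ldots$, one quickly sees the
pattern. The general result is:

$$S = \{\boldsymbol{\mu}^{(1)} - \boldsymbol{\mu}^{(l+1)},\ \boldsymbol{\mu}^{(l+1)} - \boldsymbol{\mu}^{(l)},\ \boldsymbol{\mu}^{(l)} - \boldsymbol{\mu}^{(l-1)},\ \ldots,\ \boldsymbol{\mu}^{(3)} - \boldsymbol{\mu}^{(2)}\} \tag{14.64}$$

We prefer to rearrange them in the following sequence (remember that each
root has $l$ components):

$$\begin{aligned}
\boldsymbol{\alpha}^{(1)} &= \boldsymbol{\mu}^{(3)} - \boldsymbol{\mu}^{(2)} = \left(\frac{1}{\sqrt{1 \cdot 2}}, \frac{-3}{\sqrt{2 \cdot 3}}, 0, \ldots, 0\right), \\
\boldsymbol{\alpha}^{(2)} &= \boldsymbol{\mu}^{(4)} - \boldsymbol{\mu}^{(3)} = \left(0, \frac{2}{\sqrt{2 \cdot 3}}, \frac{-4}{\sqrt{3 \cdot 4}}, 0, \ldots, 0\right), \\
&\ \ \vdots \\
\boldsymbol{\alpha}^{(a)} &= \boldsymbol{\mu}^{(a+2)} - \boldsymbol{\mu}^{(a+1)} = \left(0, \ldots, 0, \frac{a}{\sqrt{a(a+1)}}, \frac{-(a+2)}{\sqrt{(a+1)(a+2)}}, 0, \ldots, 0\right), \\
&\ \ \vdots \\
\boldsymbol{\alpha}^{(l-1)} &= \boldsymbol{\mu}^{(l+1)} - \boldsymbol{\mu}^{(l)} = \left(0, \ldots, 0, \frac{l-1}{\sqrt{(l-1)l}}, \frac{-(l+1)}{\sqrt{l(l+1)}}\right), \\
\boldsymbol{\alpha}^{(l)} &= \boldsymbol{\mu}^{(1)} - \boldsymbol{\mu}^{(l+1)} = \left(\frac{1}{\sqrt{1 \cdot 2}}, \frac{1}{\sqrt{2 \cdot 3}}, \ldots, \frac{1}{\sqrt{(l-1)l}}, \frac{l+1}{\sqrt{l(l+1)}}\right); \\
S &= \{\boldsymbol{\alpha}^{(a)},\ a = 1, 2, \ldots, l\}
\end{aligned} \tag{14.65}$$

In $\boldsymbol{\alpha}^{(a)}$, $a \leq l-1$, the two non-zero entries sit at positions $a$ and $a+1$.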




These algebraic complications in the case of $SU(l+1)$ seem pretty well unavoidable;
the observance of the tracefree condition is much like working with
independent variables in the centre of mass frame of a many-particle system!

To draw the Dynkin diagram, we need to compute the lengths and angles
among simple roots. Fortunately at this stage things are simple:

$$\begin{aligned}
|\boldsymbol{\alpha}^{(a)}| &= \sqrt{2},\quad a = 1, 2, \ldots, l; \\
\boldsymbol{\alpha}^{(a)} \cdot \boldsymbol{\alpha}^{(a+1)} &= -1,\quad a = 1, 2, \ldots, l-1,\quad \text{rest zero}
\end{aligned} \tag{14.66}$$

Therefore, the Dynkin diagram is much simpler than in the three earlier cases:

$\pi$-system for $SU(l+1) = A_l$:

    O ----- O ----- O ----- ... ----- O ----- O
  α^(1)   α^(2)   α^(3)            α^(l-1)   α^(l)            (14.67)
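The lengths and scalar products in Eq.(14.66) can be reproduced directly from the weights of the defining representation. The following is a sketch under the assumption that the simple roots are ordered as $\boldsymbol{\alpha}^{(a)} = \boldsymbol{\mu}^{(a+2)} - \boldsymbol{\mu}^{(a+1)}$ for $a \leq l-1$ and $\boldsymbol{\alpha}^{(l)} = \boldsymbol{\mu}^{(1)} - \boldsymbol{\mu}^{(l+1)}$; the function names are illustrative.

```python
import math

def su_weights(l):
    """Weights mu^(j), j = 1..l+1, of the defining representation of SU(l+1).

    Component a of mu^(j) is the j-th diagonal entry of H_a.
    """
    H = []
    for a in range(1, l + 1):
        norm = math.sqrt(a * (a + 1))
        H.append([1.0 / norm] * a + [-a / norm] + [0.0] * (l - a))
    return [[H[a][j] for a in range(l)] for j in range(l + 1)]

def su_simple_roots(l):
    """Simple roots in the chain ordering alpha^(1), ..., alpha^(l)."""
    mu = su_weights(l)                       # mu[j - 1] holds mu^(j)
    diff = lambda u, v: [x - y for x, y in zip(u, v)]
    roots = [diff(mu[a + 1], mu[a]) for a in range(1, l)]   # mu^(a+2) - mu^(a+1)
    roots.append(diff(mu[0], mu[l]))                         # mu^(1) - mu^(l+1)
    return roots

def gram(vectors):
    return [[sum(x * y for x, y in zip(u, v)) for v in vectors] for u in vectors]

def is_a_type_chain(l, tol=1e-12):
    """True if the Gram matrix has 2 on the diagonal, -1 next to it, 0 elsewhere."""
    g = gram(su_simple_roots(l))
    for a in range(l):
        for b in range(l):
            expected = 2.0 if a == b else (-1.0 if abs(a - b) == 1 else 0.0)
            if abs(g[a][b] - expected) > tol:
                return False
    return True

if __name__ == "__main__":
    for l in (1, 2, 3, 5):
        print(l, is_a_type_chain(l))
```

A Gram matrix of this form says that every simple root has length $\sqrt{2}$, neighbours in the chain make 120°, and all other pairs are perpendicular, which is exactly the single-line chain of the $A_l$ diagram.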


14.5 Coincidences for low Dimensions and 
Connectedness 


We have described in some detail the defining representations of the four classical
families of compact simple Lie groups $A_l = SU(l+1)$, $B_l = SO(2l+1)$, $C_l =
USp(2l)$, $D_l = SO(2l)$. These representations were exploited to find the Cartan
subalgebras, roots, positive roots, simple roots and then to draw the $\pi$-systems.
For small values of $l$, there are "chance" coincidences among the Lie algebras
of these four families, after which they branch out in "independent directions".
As good a way as any to spot these coincidences is to look at the respective
Dynkin diagrams, thus putting them to use! In this way one finds the local
isomorphisms:

$$\begin{aligned}
A_1 \sim B_1 \sim C_1:&\quad SU(2) \sim SO(3) \sim USp(2) \\
B_2 \sim C_2:&\quad SO(5) \sim USp(4) \\
A_3 \sim D_3:&\quad SU(4) \sim SO(6)
\end{aligned} \tag{14.68}$$


Beyond these, as the diagrams show, there are no more coincidences. 

As given by their defining representations the groups $A_l$ and $C_l$ turn out
to be simply connected, and so they are their own universal covering groups.
On the other hand, both $B_l$ and $D_l$ are doubly connected, so their universal
covering groups give in each case a two-fold covering of the group specified by
the defining representation.


Exercises for Chapter 14 


1. From the commutation relations (14.6) in the case of $SO(4)$, corresponding
to $l = 2$, show that the Lie algebra splits into two mutually commuting
$SO(3)$ Lie algebras.




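2. For the defining representation of the group $SU(3)$, the analogues of the
three Pauli matrices $\sigma_j$ familiar from $SU(2)$ are the eight Gell-Mann matrices
$\lambda_r$ defined as follows:

$$\begin{aligned}
\lambda_1 &= \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, &
\lambda_2 &= \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, &
\lambda_3 &= \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \\
\lambda_4 &= \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, &
\lambda_5 &= \begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}, &
\lambda_6 &= \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \\
\lambda_7 &= \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, &
\lambda_8 &= \frac{1}{\sqrt{3}}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix}
\end{aligned}$$

Verify the commutation and anticommutation relations

$$[\lambda_r, \lambda_s] = 2i f_{rst}\lambda_t,\qquad \{\lambda_r, \lambda_s\} = \frac{4}{3}\delta_{rs} + 2d_{rst}\lambda_t$$

where the completely antisymmetric $f_{rst}$ and completely symmetric $d_{rst}$
have these independent nonzero values:

$$\begin{aligned}
&f_{123} = 1,\quad f_{147} = -f_{156} = f_{246} = f_{257} = f_{345} = -f_{367} = \frac{1}{2},\quad f_{458} = f_{678} = \frac{\sqrt{3}}{2}; \\
&d_{118} = d_{228} = d_{338} = -d_{888} = \frac{1}{\sqrt{3}}, \\
&d_{146} = d_{157} = -d_{247} = d_{256} = d_{344} = d_{355} = -d_{366} = -d_{377} = \frac{1}{2}, \\
&d_{448} = d_{558} = d_{668} = d_{778} = -\frac{1}{2\sqrt{3}}.
\end{aligned}$$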

3. Verify explicitly at the Lie algebra level the low-dimensional coincidences
listed in Eq.(14.68).
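For Exercise 2, a numerical cross-check of individual structure constants can be useful before attempting the algebra by hand. The sketch below assumes the standard Gell-Mann matrices with $\mathrm{Tr}\,\lambda_r\lambda_s = 2\delta_{rs}$, so that $f_{rst} = \mathrm{Tr}([\lambda_r, \lambda_s]\lambda_t)/4i$ and $d_{rst} = \mathrm{Tr}(\{\lambda_r, \lambda_s\}\lambda_t)/4$; the function names `f` and `d` are chosen here for illustration. It does not replace the verification asked for in the exercise.

```python
import math

S3 = math.sqrt(3.0)

# Standard Gell-Mann matrices, labelled 1..8.
GELL_MANN = {
    1: [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    2: [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    3: [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    4: [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    5: [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    6: [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    7: [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    8: [[1 / S3, 0, 0], [0, 1 / S3, 0], [0, 0, -2 / S3]],
}

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def trace(A):
    return sum(A[i][i] for i in range(3))

def f(r, s, t):
    """f_rst = Tr([lambda_r, lambda_s] lambda_t) / (4i)."""
    A, B, C = GELL_MANN[r], GELL_MANN[s], GELL_MANN[t]
    AB, BA = matmul(A, B), matmul(B, A)
    comm = [[AB[i][j] - BA[i][j] for j in range(3)] for i in range(3)]
    return (trace(matmul(comm, C)) / 4j).real

def d(r, s, t):
    """d_rst = Tr({lambda_r, lambda_s} lambda_t) / 4."""
    A, B, C = GELL_MANN[r], GELL_MANN[s], GELL_MANN[t]
    AB, BA = matmul(A, B), matmul(B, A)
    anti = [[AB[i][j] + BA[i][j] for j in range(3)] for i in range(3)]
    return (trace(matmul(anti, C)) / 4).real

if __name__ == "__main__":
    print("f123 =", f(1, 2, 3), " f147 =", f(1, 4, 7), " f458 =", f(4, 5, 8))
    print("d118 =", d(1, 1, 8), " d247 =", d(2, 4, 7), " d888 =", d(8, 8, 8))
```

Looping `f` and `d` over all index triples gives the full tables, from which the independent non-zero values can be compared with those listed in the exercise.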


Chapter 15 


Complete Classification of 
All CSLA Simple Root 
Systems 


In Chapters 12 and 13, the general geometrical properties of roots, positive
and simple, were derived. These properties are comprehensive enough to allow
us to determine exhaustively all possible simple root systems. The detailed
look at $A_l$, $B_l$, $C_l$ and $D_l$ in the previous Chapter has given us examples of
the general theory in action. Now we take the opposite point of view and
use the geometrical restrictions to completely determine all allowed simple root
systems. The question naturally is: What are the possibilities apart from those
encountered in the last Chapter? As promised, we will be showing that there
are only five other CSLA's, the so-called exceptional algebras.

Let us right away list the three basic conditions that the system $S$ of simple
roots of a CSLA of rank $l$ must obey:

(A) $S$ consists of $l$ independent vectors in $l$-dimensional real Euclidean
space.

(B) The possible angles and length ratios among these vectors are 90° (free
ratio), 120° (ratio 1), 135° (ratio $\sqrt{2}$) and 150° (ratio $\sqrt{3}$).

(C) The system $S$ must be indecomposable, i.e., it must not split into two
disjoint subsets with every vector in one being perpendicular to every vector in
the other. This connectedness of a Dynkin diagram was proved in Chapter 13.

Throughout this Chapter, we shall frequently refer to these three conditions
(A), (B), (C). We have called a system $S$ satisfying (A), (B) and (C) a $\pi$-system.
An allowed $\pi$-system or Dynkin diagram will often be given the symbol $\Gamma$, and
be referred to as a graph; a subgraph will often be denoted by $\Gamma'$, etc.




15.1 Series of Lemmas 


Now we present a carefully ordered sequence of lemmas which, little by little,
narrow down the possible $\pi$-systems $\Gamma$.


Lemma I. A connected subgraph $\Gamma'$ of an allowed graph $\Gamma$ is also allowed. For,
properties (A) and (B) follow for $\Gamma'$ from their validity for $\Gamma$; and (C) is ensured
for $\Gamma'$ by insisting that it be connected. Of course, $l'$ for $\Gamma'$ is less than $l$ for $\Gamma$.


Lemma II. For $l = 1$, the only possible $\Gamma$ is O, corresponding to the group
$SU(2) = A_1$; and this is certainly allowed.
For $l = 2$, the possible $\Gamma$'s are

    O ----- O :  SU(3) = A_2
    O ===== ● :  USp(4) = C_2
    O ≡≡≡≡≡ ● :  G_2                                        (15.1)

Possibilities $A_2$ and $C_2$ are known to us; the third, the Lie algebra of the group
$G_2$, is also allowed. It is a group of order 14 and rank 2, the smallest of the five
exceptional groups. We shall exhibit its roots later in this Chapter.


Lemma III. For $l = 3$, the only possible $\Gamma$'s are:

    O ----- O ----- O ,      O ----- O ===== O              (15.2)

corresponding to $SU(4)$, $SO(7)$ and $USp(6)$. The proof is as follows.

The three vectors $\boldsymbol{\alpha}^{(1)}, \boldsymbol{\alpha}^{(2)}, \boldsymbol{\alpha}^{(3)}$ must be linearly independent. If the
enclosed angles are $\theta_{12}, \theta_{23}, \theta_{31}$, we can agree to order them as $\theta_{12} \leq \theta_{23} \leq \theta_{31}$.
Each $\theta$ is either 90° or 120° or 135° or 150°. Now the linear independence
condition gives this restriction:

$$\theta_{12} + \theta_{23} + \theta_{13} < 360° \tag{15.3}$$

This is best seen pictorially. Suppose $\theta_{12} = 90°$; then $\theta_{23}, \theta_{13} > 90°$ (recall that
property (C) forbids two of the $\theta$'s being 90°!). Whatever $\theta_{13}$ may be,
$\boldsymbol{\alpha}^{(3)}$ must lie on a cone with $-\boldsymbol{\alpha}^{(1)}$ as axis and semi-angle $180° - \theta_{13}$.
By moving the possible location of $\boldsymbol{\alpha}^{(3)}$ along the cone we see that the maximum



value of $\theta_{23}$ occurs exactly when the three vectors are coplanar, and that value is
$360° - 90° - \theta_{13}$. Since they cannot be coplanar, the inequality (15.3) is proved
in the case $\theta_{12} = 90°$. If $\theta_{12} > 90°$, then $\theta_{13} \geq \theta_{12}$, and only a slightly different
picture needs to be drawn, but the conclusion is the same.

Therefore, for $l = 3$, property (A) has given us the inequality (15.3), while
property (C) implies that at most one of the $\theta$'s can be 90°. The possible values
are then

$$(\theta_{12}, \theta_{23}, \theta_{13}) = (90°, 120°, 120°),\ (90°, 120°, 135°) \tag{15.4}$$

and these just correspond to the graphs (15.2). (Check that the configuration
(120°, 120°, 120°), whose graph is a closed triangle, fails to obey (A).)
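The case analysis behind Eq.(15.4) is small enough to enumerate mechanically. The sketch below lists every ordered triple of allowed angles, discards those with two or more right angles (property (C)) and those violating the inequality (15.3) (property (A)), and returns what is left. It checks only these two angle conditions and makes no attempt to test the length ratios of property (B); the function name is illustrative.

```python
from itertools import combinations_with_replacement

ALLOWED_ANGLES = (90, 120, 135, 150)   # degrees, from property (B)

def allowed_rank3_triples():
    """Ordered triples (theta12 <= theta23 <= theta13) that survive (A) and (C)."""
    survivors = []
    for triple in combinations_with_replacement(ALLOWED_ANGLES, 3):
        if triple.count(90) >= 2:
            continue        # two right angles: one vector is perpendicular to both others
        if sum(triple) >= 360:
            continue        # violates theta12 + theta23 + theta13 < 360 degrees
        survivors.append(triple)
    return survivors

if __name__ == "__main__":
    for triple in allowed_rank3_triples():
        print(triple)
```

Triples such as (120, 120, 120) or (90, 135, 135) sum to exactly 360°, which corresponds to three coplanar vectors, so they are rejected by the strict inequality.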


Lemma IV. On combining Lemma I with Lemma III, we can say that any
connected set of three vectors in an allowed $\Gamma$ for $l \geq 4$ must be of one of the
two types shown in (15.2). That is, if we take any three connected vectors in
$\Gamma$, drop all their connections to other vectors in $\Gamma$ and "rub out" those other
vectors, what remains must be one of the two diagrams (15.2).


Lemma V. We notice that in the two possible $l = 3$ cases (15.2), a triple line
connection nowhere appears. This is due, of course, to the inequality (15.3).
One can now combine Lemma IV and property (C) to state: in any allowed $\Gamma$
for $l \geq 4$, the triple line connection given below

    Γ  =  ... O ≡≡≡≡≡ ● ...

can never appear. Since it is absent for $l = 3$ as well, the only allowed $\Gamma$
containing a triple line is at $l = 2$ for $G_2$, seen in Lemma II. Hereafter, then, we
can ignore such connections (angle = 150°) completely.


Lemma VI. If in an allowed graph $\Gamma$ we have two circles connected by a single
line, whatever else there may be, we can shrink this line, replace the two roots
by a single root of the same length, and get a graph $\Gamma'$ which is also allowed.
To prove this, let $\Gamma$ look like

    Γ  =  ... O ----- O ...
              α       β




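By Lemma IV, there is no vector $\gamma$ which is connected to both $\alpha$ and $\beta$. So for
any $\gamma \in \Gamma$ distinct from both $\alpha$ and $\beta$, we have three choices:

(i) $\gamma \cdot \alpha = \gamma \cdot \beta = 0$: $\gamma$ not connected to either $\alpha$ or $\beta$,
(ii) $\gamma \cdot \alpha \neq 0$, $\gamma \cdot \beta = 0$: $\gamma$ connected to $\alpha$, not to $\beta$,
(iii) $\gamma \cdot \alpha = 0$, $\gamma \cdot \beta \neq 0$: $\gamma$ connected to $\beta$, not to $\alpha$.

Now since the angle between $\alpha$ and $\beta$ is 120°,

$$|\alpha|^2 = |\beta|^2 = |\alpha + \beta|^2$$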
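The equality of lengths is a one-line computation, spelled out here for completeness. It uses only the two facts that a single line means equal lengths, $|\alpha| = |\beta|$, and an enclosed angle of 120°:

```latex
\begin{aligned}
|\alpha + \beta|^2 &= |\alpha|^2 + |\beta|^2 + 2\,\alpha\cdot\beta \\
                   &= |\alpha|^2 + |\alpha|^2 + 2\,|\alpha|^2 \cos 120^\circ \\
                   &= 2|\alpha|^2 - |\alpha|^2 = |\alpha|^2 .
\end{aligned}
```

So the combined vector $\alpha + \beta$ has the same length as each of the two roots it replaces.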


If we construct $\Gamma'$ from $\Gamma$ by coalescing the two circles and replacing the
two vectors $\alpha$ and $\beta$ by $\alpha + \beta$, then in each of the three situations above we
have:

(i) $\gamma \cdot (\alpha + \beta) = 0$: $\gamma$ remains unconnected to the new $\alpha + \beta$;
(ii) $\gamma \cdot (\alpha + \beta) = \gamma \cdot \alpha$: $\gamma$ is now connected to $\alpha + \beta$ as it previously
was to $\alpha$, with no change in angles and lengths;
(iii) Interchange $\alpha$ and $\beta$ in (ii) just above.

Let us then check the properties (A), (B), (C) for $\Gamma'$. Property (A) holds
because it was true for $\Gamma$; the rank however is reduced by one. Property (B)
has just been shown, and (C) holds by the way we constructed $\Gamma'$. So, $\Gamma'$ is an
allowed graph with rank one less than for $\Gamma$.

As we will see this is indeed a very powerful result. It allows us to come
"cascading down" from an arbitrarily large graph to a graph of low rank, then
apply our earlier results on the allowed low order graphs.


Lemma VII. The number of double lines in an allowed $\Gamma$ is at most one. For
$l = 3$, we have this in Lemma III. For $l \geq 4$: if an allowed $\Gamma$ has two or more
double lines, we can repeatedly use Lemma VI to keep on shrinking all single
lines, at each stage obtaining an allowed $\Gamma'$, until we end up with an allowed
$\Gamma'$ with $l \geq 3$ and made up of double lines only. But this violates Lemma IV!
So, either an allowed $\Gamma$ has no double lines, or just one double line.


Lemma VIII. An allowed graph $\Gamma$ cannot have any closed loops. For, if it
does, then by Lemma IV, the number of vectors in the loop must be more than
three. Then by Lemma VII, all lines in this loop except possibly one are single
lines. Now use Lemma VI to keep shrinking these single lines one by one, at
each stage getting an allowed graph. You will end up with a loop with three
vectors, which Lemma IV disallows! Note that after each use of Lemma VI, the
result would have had to be an allowed graph with a loop.

At this stage, then, we can say: an allowed $\Gamma$ has only single lines and
some branch points; or one double line, remaining single lines, and some branch
points; no loops in either case. The next two Lemmas tackle the branch points.




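Lemma IX. If an allowed graph $\Gamma$ has the appearance

    [Figure: a subgraph Γ_1 containing the circle γ, with two further circles α and β
    each joined to γ by a single line]

then, (i) the graph $\Gamma'$

    [Figure: the subgraph Γ_1, with γ joined by a double line to a single circle α + β]

is allowed;

(ii) $\Gamma_1$ has no double lines inside itself. The proof is quite cute. By Lemma
I, the subgraph $\Gamma_1$ (including $\gamma$) is an allowed one. Now for the three vectors
$\alpha, \beta, \gamma$ in $\Gamma$, we have

$$|\alpha| = |\beta| = |\gamma|,\qquad \alpha \cdot \beta = 0,\qquad \alpha \cdot \gamma = \beta \cdot \gamma = -\tfrac{1}{2}|\gamma|^2$$

Therefore,

$$|\alpha + \beta| = \sqrt{2}\,|\gamma|,\qquad \theta_{\gamma,\alpha+\beta} = 135°.$$

For the graph $\Gamma'$ built up above, (A) is valid because it is so for $\Gamma$; (B) holds
by the above calculation; (C) holds by the construction. We see that $\Gamma'$ is an
allowed graph. Then by Lemma VII, $\Gamma_1$ cannot have had any double lines!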
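The length and angle quoted in the proof follow in two lines from the scalar products among $\alpha$, $\beta$, $\gamma$ (equal lengths, $\alpha \cdot \beta = 0$, and 120° between $\gamma$ and each of $\alpha$, $\beta$), which are the only inputs assumed here:

```latex
\begin{aligned}
|\alpha + \beta|^2 &= |\alpha|^2 + |\beta|^2 + 2\,\alpha\cdot\beta = 2|\gamma|^2
   \quad\Longrightarrow\quad |\alpha + \beta| = \sqrt{2}\,|\gamma| , \\
\cos\theta_{\gamma,\alpha+\beta}
   &= \frac{\gamma\cdot(\alpha + \beta)}{|\gamma|\,|\alpha + \beta|}
    = \frac{-\tfrac12|\gamma|^2 - \tfrac12|\gamma|^2}{\sqrt{2}\,|\gamma|^2}
    = -\frac{1}{\sqrt{2}}
   \quad\Longrightarrow\quad \theta_{\gamma,\alpha+\beta} = 135^\circ .
\end{aligned}
```

An angle of 135° with length ratio $\sqrt{2}$ is precisely the double-line case of property (B), which is why the pair of branches can be traded for a double line.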


Lemma X. An allowed $\Gamma$ can have at most one branch point. If it has one,
then it cannot have any double lines. (The number of branches at the branch
point will be controlled by Lemma XI!)

The proof is quickest by pictures. We have three cases to consider:

(a) If $\Gamma$ has two or more branch points and no double lines, then

    Γ  --(Lemma VI, repeatedly)-->  [Figure: reduced graph with two branch points]
       --(Lemma IX, twice)-->  O ===== O ----- O ===== O

which violates Lemma VII.

(b) If $\Gamma$ has two or more branch points and one double line, then

    Γ  --(Lemma VI, repeatedly)-->  [Figure: two possible reduced graphs, each with
                                     a branch point and a double line]

both of which violate Lemma IX.



(c) If $\Gamma$ has one branch point and one double line, then

    Γ  --(Lemma VI, repeatedly)-->  [Figure: a double line O ===== O with two further
                                     circles joined by single lines to one of its ends]

which violates Lemma IX.

In all the above, no assumption was made about the number of branches
in $\Gamma$ at a branch point.

So we have reached the point where we can say: in an allowed $\Gamma$, the
number of double lines and the number of branch points cannot each exceed
one, and the sum of these two numbers also cannot exceed one! We have next
the last Lemma.


Lemma XI. If an allowed $\Gamma$ has a branch point (hence no double lines) the
number of branches there must be exactly three. For, if there were four or more
branches, then

    Γ  --(Lemma VI)-->  [Figure: a branch point with four single circles attached]
       --(Lemma IX(i))-->  [Figure: a double line O ===== O with two further circles
                            still attached at the branch point]

which violates Lemma IX (ii).

These Lemmas have drastically reduced the graphs $\Gamma$ to be examined. Let us
see next what the survivors are like!


15.2 The allowed Graphs Γ


We can classify the different possibilities that remain according to the number
of branch points and double lines. There are then three situations.

(x) No branch point, no double line: This leads to the graph,

    Γ  =  O ----- O ----- O ... O ----- O
        α^(1)   α^(2)   α^(3)         α^(l)

which is certainly a possible graph, in fact the one for $A_l = SU(l+1)$, Eq.(14.67).

(y) No branch point, one double line: One has the general possibility,

          √2      √2      √2        √2        1             1       1
    Γ  =  O ----- O ----- O  ...   O ======= O    ...     O ----- O
          1       2       3        l_1     l_1+1          ...   l_1+l_2 = l

with $l_1$ ($l_2$) vectors before (after) the double line. We assume that the former
have lengths $\sqrt{2}$, the latter lengths unity, as indicated in the drawing. So to
begin with we have

$$l_1 \geq 1,\qquad l_2 \geq 1,\qquad l = l_1 + l_2.$$



Now the condition (A) of linear independence of simple roots comes into the
act and puts limits on $l_1$ and $l_2$. It shows that $l_1 \ge 2, l_2 \ge 3$ and $l_1 \ge 3, l_2 \ge 2$
are both disallowed (these are overlapping regimes!). If we had $l_1 \ge 2, l_2 \ge 3$ to
begin with, then such a $\Gamma$ reduces by repeated use of Lemma VI to $l_1 = 2, l_2 = 3$:

    length:   √2     √2      1      1      1
    Gamma' =  O   -  O   ==  O   -  O   -  O
    root:     α(1)   α(2)    α(3)   α(4)   α(5)

But condition (A) is violated because, as one can easily calculate,

$$|\alpha^{(1)} + 2\alpha^{(2)} + 3\alpha^{(3)} + 2\alpha^{(4)} + \alpha^{(5)}|^2 = 0.$$

Therefore, along with $\Gamma'$, even $\Gamma$ is disallowed. With $l_1 \ge 3, l_2 \ge 2$, we would
have reduced $\Gamma$ to $l_1 = 3, l_2 = 2$:

    length:    √2     √2     √2      1      1
    Gamma'' =  O   -  O   -  O   ==  O   -  O
    root:      α(1)   α(2)   α(3)    α(4)   α(5)

and now

$$|\alpha^{(1)} + 2\alpha^{(2)} + 3\alpha^{(3)} + 4\alpha^{(4)} + 2\alpha^{(5)}|^2 = 0.$$
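The two vanishing norms quoted above can be checked mechanically. The following is a minimal sketch, not part of the original text: the helper names `gram_matrix` and `norm_sq` are hypothetical, and it assumes the usual rule that two simple roots joined by $n$ lines satisfy $4(\alpha\cdot\beta)^2/(|\alpha|^2|\beta|^2) = n$ with $\alpha\cdot\beta \le 0$.

```python
from fractions import Fraction
from math import isqrt

def gram_matrix(lengths_sq, bonds):
    """Gram matrix alpha_i . alpha_j of the simple roots of a Dynkin graph.

    lengths_sq[i] is |alpha_i|^2 (an integer here); bonds maps a pair (i, j)
    to the number of lines joining circles i and j.
    Assumed rule: alpha_i . alpha_j = -(1/2) sqrt(lines * |alpha_i|^2 * |alpha_j|^2).
    """
    n = len(lengths_sq)
    g = [[Fraction(0)] * n for _ in range(n)]
    for i, l2 in enumerate(lengths_sq):
        g[i][i] = Fraction(l2)
    for (i, j), lines in bonds.items():
        prod = lines * lengths_sq[i] * lengths_sq[j]
        root = isqrt(prod)
        assert root * root == prod, "inner product would be irrational"
        g[i][j] = g[j][i] = Fraction(-root, 2)
    return g

def norm_sq(coeffs, g):
    """Squared length of sum_i coeffs[i] * alpha_i."""
    n = len(coeffs)
    return sum(coeffs[i] * g[i][j] * coeffs[j] for i in range(n) for j in range(n))

# Gamma': lengths sqrt2, sqrt2, 1, 1, 1; double line between circles 2 and 3
g1 = gram_matrix([2, 2, 1, 1, 1], {(0, 1): 1, (1, 2): 2, (2, 3): 1, (3, 4): 1})
# Gamma'': lengths sqrt2, sqrt2, sqrt2, 1, 1; double line between circles 3 and 4
g2 = gram_matrix([2, 2, 2, 1, 1], {(0, 1): 1, (1, 2): 1, (2, 3): 2, (3, 4): 1})

print(norm_sq([1, 2, 3, 2, 1], g1))  # 0
print(norm_sq([1, 2, 3, 4, 2], g2))  # 0
```

Exact rational arithmetic is used so that the zero is an identity rather than a floating-point coincidence.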


What then remains in the no branch point, one double line case? We have:

$l_1 = 1, l_2 \ge 1$:

    length:  √2    1    1          1    1
             O  == O  - O  - ... - O  - O     $l = l_2 + 1 \ge 2$: $USp(2l) = C_l$, Eq.(14.49)

$l_1 = 2, l_2 = 1$:

    length:  √2   √2    1
             O  - O  == O                     $l = 3$: $SO(7) = B_3$, Eq.(14.29)

$l_1 = 2, l_2 = 2$:

    length:  √2   √2    1    1
             O  - O  == O  - O                $l = 4$: Exceptional group $F_4$, see below

$l_1 \ge 3, l_2 = 1$:

    length:  √2   √2         √2   √2    1
             O  - O  - ... - O  - O  == O     $l = l_1 + 1 \ge 4$: $SO(2l+1) = B_l$, Eq.(14.29)

Therefore the no branch point-one double line case has given us $USp(2l)$
for $l \ge 2$, $SO(2l+1)$ for $l \ge 3$, and $F_4$.
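(z) One branch point, no double line: Leaving out the circle at the vertex let
the three arms have $l_1, l_2, l_3$ circles respectively, with

$$l_1 \ge l_2 \ge l_3 \ge 1, \quad l = l_1 + l_2 + l_3 + 1 \ge 4.$$

So to begin we have the graph

[Figure: the graph $\Gamma(l_1, l_2, l_3)$, three arms of $l_1$, $l_2$ and $l_3$ circles meeting at a branch point.]

Our experience in situation (y) previously discussed suggests that if the
arms are "excessively long", property (A) of linear independence of simple roots
may fail. Indeed this is so. It happens that:

$\Gamma(l_1 \ge 5, l_2 \ge 2, l_3) \xrightarrow{\text{Lemma}} \Gamma'(5,2,1)$ =

[Figure: the graph $\Gamma'(5,2,1)$ with circles labelled $\alpha^{(1)}, \ldots, \alpha^{(9)}$.]

$$|\alpha^{(1)} + 2\alpha^{(2)} + 3\alpha^{(3)} + 4\alpha^{(4)} + 5\alpha^{(5)} + 6\alpha^{(6)} + 3\alpha^{(7)} + 4\alpha^{(8)} + 2\alpha^{(9)}|^2 = 0,$$

$\Gamma(l_1 \ge 3, l_2 \ge 3, l_3) \xrightarrow{\text{Lemma}} \Gamma'(3,3,1)$ =

[Figure: the graph $\Gamma'(3,3,1)$ with circles labelled $\alpha^{(1)}, \ldots, \alpha^{(8)}$.]

$$|\alpha^{(1)} + 2\alpha^{(2)} + 3\alpha^{(3)} + 4\alpha^{(4)} + 2\alpha^{(5)} + 3\alpha^{(6)} + 2\alpha^{(7)} + \alpha^{(8)}|^2 = 0,$$

$\Gamma(l_1 \ge l_2 \ge l_3 \ge 2) \xrightarrow{\text{Lemma}} \Gamma'(2,2,2)$ =

[Figure: the graph $\Gamma'(2,2,2)$ with circles labelled $\alpha^{(1)}, \ldots, \alpha^{(7)}$.]

$$|\alpha^{(1)} + 2\alpha^{(2)} + 3\alpha^{(3)} + 2\alpha^{(4)} + \alpha^{(5)} + 2\alpha^{(6)} + \alpha^{(7)}|^2 = 0.$$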
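The three relations above can also be verified with a few lines of arithmetic. This is an illustrative sketch only: the function name `norm_sq_three_arms` is hypothetical, and the assignment of coefficients to arms is an assumption (the largest coefficient sits on the branch-point circle and the coefficients decrease along each arm), since the labelled figures are not reproduced here. All roots are taken to have length $\sqrt{2}$, with $\alpha\cdot\beta = -1$ for joined circles.

```python
def norm_sq_three_arms(arms, centre):
    """|sum_i c_i alpha_i|^2 for a graph with one branch point and single lines.

    arms   : three lists of coefficients, each listed from the tip of the arm
             inwards towards the branch point
    centre : coefficient of the circle at the branch point
    Assumes |alpha_i|^2 = 2 and alpha_i . alpha_j = -1 for joined circles.
    """
    total = 2 * centre * centre
    for arm in arms:
        total += 2 * sum(c * c for c in arm)
        chain = list(arm) + [centre]          # the arm ends on the branch point
        total -= 2 * sum(x * y for x, y in zip(chain, chain[1:]))
    return total

print(norm_sq_three_arms([[1, 2, 3, 4, 5], [2, 4], [3]], 6))  # Gamma'(5,2,1): 0
print(norm_sq_three_arms([[1, 2, 3], [1, 2, 3], [2]], 4))     # Gamma'(3,3,1): 0
print(norm_sq_three_arms([[1, 2], [1, 2], [1, 2]], 3))        # Gamma'(2,2,2): 0
```

For an allowed graph the same quadratic form is positive for every non-zero choice of coefficients, which is why shortening the arms by one circle in each case leaves a legitimate system.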


After this large scale elimination of possibilities, the residue is the following:

    l_1    l_2   l_3    π-system for
    ≥ 5    1     1      l = l_1 + 3:  SO(2l) = D_l for l ≥ 8
    4      2     1      l = 8:  Exceptional group E_8 (see below)
    4      1     1      l = 7:  SO(14) = D_7
    3      2     1      l = 7:  Exceptional group E_7 (see below)
    3      1     1      l = 6:  SO(12) = D_6
    2      2     1      l = 6:  Exceptional group E_6 (see below)
    2      1     1      l = 5:  SO(10) = D_5
    1      1     1      l = 4:  SO(8) = D_4

Thus in this category we have obtained, at first in a torrent but then in a
trickle, the $\pi$-systems of $SO(2l) = D_l$ for $l \ge 4$; and then the three exceptional
groups $E_6, E_7, E_8$.

We can collect and display all allowed Dynkin diagrams, hence also all
possible CSLA's, in one master table. In the first three columns we list the
number of branch points, double lines, triple lines, and in the last column give
the diagrams and the names of the groups.




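    Branch   Double   Triple   Name and Diagram
    Points   Lines    Lines

    0        0        1        G_2:                     O≡O

    0        1        0        B_l = SO(2l+1), l ≥ 3:   O ... O-O=O    (lengths √2, ..., √2, √2, 1)
                               C_l = USp(2l), l ≥ 2:    O ... O-O=O    (lengths 1, ..., 1, 1, √2)
                               F_4:                     O-O=O-O

    1        0        0        D_l = SO(2l), l ≥ 4:     [diagram: arms of l-3, 1 and 1 circles meeting at the branch point]
                               E_l, l = 6,7,8:          [diagram: arms of l-4, 2 and 1 circles meeting at the branch point]

    0        0        0        A_l = SU(l+1), l ≥ 1:    O-O ... O-O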


15.3. The Exceptional Groups 


In comparison with the results known to us from Chapter 14, the new possible
CSLA's found now are the groups $G_2, F_4, E_6, E_7, E_8$; these are in a sense the
surprises. We will not enter into a detailed discussion of any of them, but only
gather some information, at first in tabular form, then say something about the
root system for each.

    Group   Rank   Order   Dimension of    Close relative of
                           smallest UIR
    G_2     2      14      7               B_3 = SO(7)
    F_4     4      52      26              B_4 = SO(9)
    E_6     6      78      27, 27*         A_5 = SU(6)
    E_7     7      133     56              A_7 = SU(8)
    E_8     8      248     248             D_8 = SO(16)


Their root systems happen to be the following (where $e$'s are unit vectors
in Euclidean spaces):

$G_2$: Start with the 6 roots of $SU(3)$: $\pm e_1$, $\tfrac{1}{2}(\pm e_1 \pm \sqrt{3}\, e_2)$; to these, add on
$\pm\sqrt{3}\, e_2$, $\tfrac{\sqrt{3}}{2}(\pm\sqrt{3}\, e_1 \pm e_2)$, making 12 in all.

$F_4$: Start with the 32 roots of $SO(9)$: $\pm e_a$, $\pm e_a \pm e_b$, $a, b = 1, \ldots, 4$; to
these, add on $\tfrac{1}{2}(\pm e_1 \pm e_2 \pm e_3 \pm e_4)$, making 48 in all.

$E_6$: Start with the 30 roots of $SU(6)$ in the (Racah) form $e_a - e_b$, $a, b = 1 \ldots 6$,
using unit vectors in $R^6$; extend $R^6$ to $R^7$ by including $e_7$; then add on
the roots $\pm\sqrt{2}\, e_7$, $\pm\tfrac{1}{\sqrt{2}} e_7 + \tfrac{1}{2}(\pm e_1 \pm e_2 \pm e_3 \pm e_4 \pm e_5 \pm e_6)$
with three plus and three minus signs; get 72 in all.

$E_7$: Start with the 56 roots of $SU(8)$ in the (Racah) form $e_a - e_b$, $a, b = 1 \ldots 8$,
using unit vectors in $R^8$; then add on $\tfrac{1}{2}(\pm e_1 \pm e_2 \ldots \pm e_8)$, with four +
and four - signs; get 126 in all.

$E_8$: Start with the 112 roots of $SO(16)$: $\pm e_a \pm e_b$, $a, b = 1 \ldots 8$; then add
on $\tfrac{1}{2}(\pm e_1 \pm e_2 \ldots \pm e_8)$, even number of plus signs; get 240 roots in all.
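The root counts quoted for $F_4$ and $E_8$ are easy to confirm by brute-force enumeration. The sketch below is illustrative only; the helper names `so_roots` and `half_sign_vectors` are hypothetical and simply transcribe the two recipes stated above.

```python
from fractions import Fraction
from itertools import combinations, product

def so_roots(n, include_short):
    """Roots +-e_a +- e_b (a < b) of SO(2n); with include_short=True the
    vectors +-e_a are added, which gives the roots of SO(2n+1)."""
    roots = []
    for a, b in combinations(range(n), 2):
        for sa, sb in product((1, -1), repeat=2):
            v = [Fraction(0)] * n
            v[a], v[b] = Fraction(sa), Fraction(sb)
            roots.append(tuple(v))
    if include_short:
        for a in range(n):
            for s in (1, -1):
                v = [Fraction(0)] * n
                v[a] = Fraction(s)
                roots.append(tuple(v))
    return roots

def half_sign_vectors(n, keep):
    """Vectors (1/2)(+-e_1 +- ... +- e_n) whose sign pattern passes keep()."""
    half = Fraction(1, 2)
    return [tuple(half * s for s in signs)
            for signs in product((1, -1), repeat=n) if keep(signs)]

f4_roots = so_roots(4, True) + half_sign_vectors(4, lambda s: True)
e8_roots = so_roots(8, False) + half_sign_vectors(8, lambda s: s.count(1) % 2 == 0)

print(len(f4_roots), len(e8_roots))  # 48 240
```

All 240 vectors of the $E_8$ list have squared length 2, while the $F_4$ list contains vectors of squared length 1 and 2, in line with the table of simple root lengths.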

The distribution of simple roots of various lengths in all CSLA can be
conveniently taken as follows:

    Group                  Length 1   Length √2   Length √3
    SU(l+1) = A_l          -          l           -
    SO(2l+1) = B_l         1          l-1         -
    USp(2l) = C_l          l-1        1           -
    SO(2l) = D_l           -          l           -
    G_2                    1          -           1
    F_4                    2          2           -
    E_l, l = 6,7,8         -          l           -


Chapter 16 


Representations of Compact 
Simple Lie Algebras 


In earlier Chapters we have dealt with certain specific UIR’s of the groups 
of interest to us, namely the adjoint representation, and for each of the four 
classical families a defining representation. Now we shall gather and describe 
some aspects of a generic UIR, to be denoted D, of any CSLA. We assume D 
acts on a complex linear vector space V of some dimension. The aim will be 
to convey an overall appreciation of structure within a UIR, as well as patterns 
holding for the set of all UIR’s of CSLA’s, without entering into detailed proofs 
of all statements. 


16.1 Weights and Multiplicities 


We have some compact simple Lie group $G$ with Lie algebra $\mathcal{L}$, spanned by the
Cartan-Weyl basis $H_a$, $E_\alpha$. Since the $H_a$ are commuting hermitian operators
on $V$, let us simultaneously diagonalise them and denote a common eigenvector
by $|\mu\ldots\rangle$:

$$H_a|\mu\ldots\rangle = \mu_a|\mu\ldots\rangle, \quad a = 1,2,\ldots,l \qquad (16.1)$$

We assume $|\mu\ldots\rangle$ is normalised to unity. Here $\mu$ is a real $l$-component vector
with components $\mu_a$ being the eigenvalues of $H_a$. We have already introduced
the term weight vector for $\mu$: it is a vector in the same $l$-dimensional Euclidean
space in which roots lie. We can construct a basis for $V$ with vectors $|\mu\ldots\rangle$.
Vectors $|\mu\ldots\rangle$ with different weights $\mu$ are clearly linearly independent.

While the $H_a$ do commute pairwise, they may not be a complete commuting
set. This may even depend on the particular UIR $D$. So we have put dots after
$\mu$ in the ket vectors $|\mu\ldots\rangle \in V$: for a given $\mu$, there may be several independent
simultaneous eigenkets. This is the problem of multiplicity. A nondegenerate
$\mu$, i.e., a weight for which there is just one eigenket $|\mu\rangle$, is said to be a simple
weight (of course for the UIR $D$ under consideration). Thus, one of Cartan's
theorems can be expressed as saying that the non-zero weights of the adjoint
representation are all simple.

Given the UIR $D$ acting on $V$, we will often denote the set of all weights
occurring by $W$: here no attention is paid to the possible multiplicity of
eigenvectors in $V$ for each $\mu \in W$.


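16.2 Actions of $E_\alpha$ and $SU(2)^{(\alpha)}$ - the Weyl Group


How do the nonhermitian generators $E_\alpha$ affect a ket $|\mu\ldots\rangle$? The $H_a$ - $E_\alpha$
commutation rule tells us that in general $E_\alpha$ changes the weight in a definite
way: it shifts it by a definite amount:

$$H_a E_\alpha|\mu\ldots\rangle = [H_a, E_\alpha]|\mu\ldots\rangle + E_\alpha H_a|\mu\ldots\rangle = (\mu + \alpha)_a E_\alpha|\mu\ldots\rangle \qquad (16.2)$$

Therefore either $E_\alpha|\mu\ldots\rangle$ vanishes, or it is another vector in $V$ with weight
$\mu + \alpha$. In the latter case, the multiplicities of $\mu$ and $\mu + \alpha$ could be different.
We write symbolically

$$E_\alpha|\mu\ldots\rangle = (\ldots)|\mu + \alpha\ldots\rangle. \qquad (16.3)$$

Thus the relationship (14.1) is easily understood.

Now take a root $\alpha \in \mathfrak{R}$, its associated group $SU(2)^{(\alpha)}$, and analyse a weight
$\mu \in W$ with respect to it. Recall that the $SU(2)^{(\alpha)}$ generators are

$$J_3 = \frac{\alpha\cdot H}{|\alpha|^2}, \quad J_\pm = \frac{\sqrt{2}\,E_{\pm\alpha}}{|\alpha|} \qquad (16.4)$$

The ket $|\mu\ldots\rangle$ is an eigenket of $J_3$:

$$J_3|\mu\ldots\rangle = m|\mu\ldots\rangle, \quad m = \alpha\cdot\mu/|\alpha|^2 \qquad (16.5)$$

From our knowledge of UIR's of $SU(2)$, we see that $m$ is quantized and
must have one of the values $0, \pm 1/2, \pm 1, \ldots$. More precisely,

$$\alpha \in \mathfrak{R},\ \mu \in W \ \Rightarrow\ 2\alpha\cdot\mu/|\alpha|^2 = 0, \pm 1, \pm 2, \ldots \qquad (16.6)$$

Again from the structure of $SU(2)$ UIR's, we can see by applying $E_{\pm\alpha}$ repeatedly
to $|\mu\ldots\rangle$ that there must be integers $p, q \ge 0$ such that

$$\mu + p\alpha,\ \mu + (p-1)\alpha,\ \ldots,\ \mu + \alpha,\ \mu,\ \mu - \alpha,\ \ldots,\ \mu - (q-1)\alpha,\ \mu - q\alpha \in W,$$
$$\mu + (p+1)\alpha,\ \mu - (q+1)\alpha \notin W. \qquad (16.7)$$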


(You can see the similarity to the root analysis in Chapter 12). In particular,
there can be no gaps in this sequence of weights in $W$. Now we must consider
the question of multiplicities.

Consider the set of ket vectors in $V$ with the above string of weights, each
occurring a certain number of times:

$$|\mu + p\alpha, \ldots\rangle,\ |\mu + (p-1)\alpha, \ldots\rangle,\ \ldots,\ |\mu + \alpha, \ldots\rangle,\ |\mu, \ldots\rangle,\ |\mu - \alpha, \ldots\rangle,\ \ldots,\ |\mu - q\alpha, \ldots\rangle \qquad (16.8)$$

They span some subspace of $V$ which is clearly invariant under action by
$SU(2)^{(\alpha)}$. So there must be present here a certain set of UIR's of $SU(2)^{(\alpha)}$
(i.e., $j$-values), each occurring a certain number of times. The string of $J_3$
eigenvalues involved is evidently,

$$m + p,\ m + p - 1,\ \ldots,\ m + 1,\ m,\ m - 1,\ \ldots,\ m - q, \qquad (16.9)$$

where $m$ was determined in Eq.(16.5). The fact that the chain (16.8) terminates
at the indicated ends means that

$$E_\alpha|\mu + p\alpha, \ldots\rangle = E_{-\alpha}|\mu - q\alpha, \ldots\rangle = 0 \qquad (16.10)$$

Evidently in the spectrum of $j$-values present in the $SU(2)^{(\alpha)}$ UIR on this string
of states, we have

$$j_{max} = m + p = -(m - q),$$
$$j_{max} = (1/2)(q + p), \quad m = (1/2)(q - p) \qquad (16.11)$$

The UIR with $j = j_{max}$ is present as often as the multiplicity of $|\mu + p\alpha\ldots\rangle$,
which must be the same as for $|\mu - q\alpha\ldots\rangle$. The UIR with $j = j_{max} - 1$ is
present as often as the difference of the multiplicities of $|\mu + (p-1)\alpha\ldots\rangle$ and
$|\mu + p\alpha\ldots\rangle$, and so on. [Throughout we are considering $SU(2)^{(\alpha)}$ action on just
the set of vectors (16.8)]. If we wish, we could arrange to diagonalise $E_{-\alpha}E_\alpha$
on these states, which will then explicitly reveal the spectrum and multiplicity
of $j$-values.

A characteristic feature of UIR's of $SU(2)$ is that if the eigenvalue $m$ of $J_3$
occurs, then so does $-m$; in fact within $SU(2)$ representations, the corresponding
vectors are related by a 180° rotation about the y-axis:

$$e^{i\pi J_2}|j, m\rangle \sim |j, -m\rangle \qquad (16.12)$$

Applying this to the present case of $SU(2)^{(\alpha)}$, we see that if $\mu \in W$ and $\alpha \in \mathfrak{R}$,
then

$$\mu - 2\alpha\,\alpha\cdot\mu/|\alpha|^2 \in W \qquad (16.13)$$

with the same multiplicity as for $\mu$. This is because

$$\exp(i\pi J_2^{(\alpha)})(\text{set of states } |\mu\ldots\rangle) = \text{set of states } |\mu - 2\alpha\,\alpha\cdot\mu/|\alpha|^2, \ldots\rangle \qquad (16.14)$$

and $\exp(i\pi J_2^{(\alpha)})$ is evidently unitary.

We can summarise these results of applying $SU(2)^{(\alpha)}$ to $\mu \in W$:

(i) there exist integers $p, q \ge 0$ such that Eq.(16.7) holds;
(ii) $2\mu\cdot\alpha/|\alpha|^2 = q - p = 0, \pm 1, \pm 2, \ldots$;
(iii) $\mu - 2\alpha\,\alpha\cdot\mu/|\alpha|^2 \in W$, with the same multiplicity as $\mu$.

The reflection operation $\exp(i\pi J_2^{(\alpha)})$ is an important and useful one. For
each $\alpha \in \mathfrak{R}$, one such reflection is defined. In weight space, this is clearly
reflection through a plane perpendicular to $\alpha$ passing through the origin. These
reflections generate a finite group, the Weyl group, whose elements are realised
by elements of $G$. If $\mu \in W$, any element of the Weyl group applied to $\mu$ gives some $\mu' \in W$
with the same multiplicity as $\mu$. We say $\mu, \mu' \in W$ are equivalent, if they are
related by an element of the Weyl group. So equivalent weights have the same
multiplicity, and

$$\text{Set of weights equivalent to } \mu \in W = \{\mu - 2\alpha\,\alpha\cdot\mu/|\alpha|^2 \mid \alpha \in \mathfrak{R}\}, \qquad (16.15)$$

where the reflections are to be applied repeatedly until no new weights appear.
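A small computational sketch may help make the notion of equivalent weights concrete. It is not part of the original text: the function names `reflect` and `weyl_orbit` are hypothetical, and the example uses the roots $\pm e_1, \pm e_2, \pm e_1 \pm e_2$ of $SO(5) = B_2$. Since reflection in $-\alpha$ coincides with reflection in $\alpha$, only the positive roots are listed.

```python
from fractions import Fraction

def reflect(mu, alpha):
    """Weyl reflection mu -> mu - 2 alpha (alpha . mu) / |alpha|^2."""
    factor = 2 * sum(a * m for a, m in zip(alpha, mu)) / sum(a * a for a in alpha)
    return tuple(m - factor * a for m, a in zip(mu, alpha))

def weyl_orbit(mu, roots):
    """All weights reachable from mu by repeated reflections in the given roots."""
    start = tuple(mu)
    orbit = {start}
    frontier = [start]
    while frontier:
        w = frontier.pop()
        for alpha in roots:
            image = reflect(w, alpha)
            if image not in orbit:
                orbit.add(image)
                frontier.append(image)
    return orbit

F = Fraction
# positive roots of SO(5) = B_2: e1, e2, e1 + e2, e1 - e2
b2_roots = [(F(1), F(0)), (F(0), F(1)), (F(1), F(1)), (F(1), F(-1))]

print(sorted(weyl_orbit((F(1), F(0)), b2_roots)))      # the four weights +-e1, +-e2
print(len(weyl_orbit((F(1, 2), F(1, 2)), b2_roots)))   # 4
print(len(weyl_orbit((F(2), F(1)), b2_roots)))         # 8
```

A weight in general position has as many equivalent weights as the Weyl group has elements (8 in this example), while special weights such as $e_1$ have fewer.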


16.3. Dominant Weights, Highest Weight of a UIR


We are working with a definite choice and sequence of the operators $H_a$ in the
Cartan subalgebra. We then say: a weight $\mu$ is higher than a weight $\mu'$ if in the
difference $\mu - \mu'$ (which may not be a weight!) the first non-zero component is
positive. So for any two distinct weights, one is definitely higher than the other.

The highest weight in a set of equivalent weights, i.e., a set related by Weyl
reflections, is called a dominant weight.

Given the UIR $D$ acting on $V$, and the set $W$ of all weights that occur,
denote by $\Lambda \in W$ the highest weight. Then it is a simple weight with no
multiplicity or degeneracy; there is just one ket $|\Lambda\rangle$. We indicate the proof later
on. Clearly, if $\Lambda$ is the highest weight, it means that if $\alpha$ is any positive root,
$\Lambda + \alpha$ is not present in $W$:

$$\alpha \in \mathfrak{R}_+ \ \Rightarrow\ \Lambda + \alpha \notin W, \quad E_\alpha|\Lambda\rangle = 0 \qquad (16.16)$$

Even more economically stated, this is the same as

$$\alpha^{(a)} \in S: \quad E_{\alpha^{(a)}}|\Lambda\rangle = 0, \quad a = 1, 2, \ldots, l \qquad (16.17)$$

So for a general compact simple Lie group $G$, the set of "raising operators"
$E_{\alpha^{(a)}}$ for $\alpha^{(a)} \in S$ generalises the single $J_+$ of $SU(2)$. The highest weight
state $|\Lambda\rangle$ in $V$ is simultaneously the maximum magnetic quantum number state
($m = j$) for the $SU(2)^{(\alpha)}$ algebras for all $\alpha \in \mathfrak{R}_+$. This means:

$$\alpha \in \mathfrak{R}_+: \quad 2\Lambda\cdot\alpha/|\alpha|^2 = \text{integer} \ge 0 \qquad (16.18)$$


We shall introduce the notation

$$\alpha^{(a)} \in S: \quad 2\Lambda\cdot\alpha^{(a)}/|\alpha^{(a)}|^2 = N_a \ge 0, \quad \Lambda \leftrightarrow \{N_a\} \qquad (16.19)$$

It turns out that each UIR can be uniquely characterised by its highest weight
$\Lambda$, or by giving the nonnegative integers $N_a$.

Let us indicate why the highest weight state $|\Lambda\rangle$ in a UIR is nondegenerate.
Clearly we can go from $|\Lambda\rangle$ to other (basis) kets in $V$ by action with $E_\alpha$ for
$\alpha \in \mathfrak{R}_-$. Now suppose there are two highest weight states $|\Lambda, 1\rangle$ and $|\Lambda, 2\rangle$,
mutually orthogonal. With no loss of generality:

$$\langle\Lambda, 2|\Lambda, 1\rangle = 0 \qquad (16.20)$$

Then for any $\alpha \in \mathfrak{R}_-$, the states $E_\alpha|\Lambda, 1\rangle$ and $E_\alpha|\Lambda, 2\rangle$ will also be orthogonal:

$$\alpha \in \mathfrak{R}_- \Rightarrow -\alpha \in \mathfrak{R}_+: \quad \langle\Lambda, 2|E_\alpha^\dagger E_\alpha|\Lambda, 1\rangle = \langle\Lambda, 2|E_{-\alpha} E_\alpha|\Lambda, 1\rangle$$
$$= \langle\Lambda, 2|[E_{-\alpha}, E_\alpha]|\Lambda, 1\rangle$$
$$= -\alpha\cdot\Lambda\,\langle\Lambda, 2|\Lambda, 1\rangle = 0 \qquad (16.21)$$

One can continue this argument for chains like $E_\beta E_\alpha|\Lambda, 1 \text{ or } 2\rangle$, and finds
similar results. So if the highest weight state has degeneracy, the whole space
$V$ splits into two or more orthogonal subspaces, unconnected by the generators,
hence the representation must be reducible. Conversely, in a UIR the highest
weight is multiplicity free. It also is true that two UIR's with the same highest
weight are equivalent.

Now let us return to the problem of classifying UIR's. It turns out that
there is one UIR (up to unitary equivalence) for each given set of nonnegative
integers $\{N_a\}$ defined in Eq.(16.19), and conversely:

$$N_a \ge 0,\ a = 1, 2, \ldots, l \ \leftrightarrow\ \text{unique UIR } \{N_a\} \text{ of } G, \text{ highest weight } \Lambda:$$
$$2\Lambda\cdot\alpha^{(a)}/|\alpha^{(a)}|^2 = N_a \qquad (16.22)$$


Can we explicitly exhibit this $\Lambda$ once $\{N_a\}$ are given? We simply use the fact
that the set of simple roots $S = \{\alpha^{(a)}\}$ is linearly independent and forms a
basis for root and weight space. The "problem" is that we have an oblique
non-orthonormal basis. Nevertheless, linear independence of the $\alpha^{(a)}$ allows us
to expand $\Lambda$ as

$$\Lambda = \sum_{a=1}^{l} \lambda_a \alpha^{(a)}, \quad \lambda_a \text{ real} \qquad (16.23)$$

The linear relations between $\lambda_a$ and $N_a$ then are

$$N_a = \sum_{b=1}^{l} \left(2\,\alpha^{(a)}\cdot\alpha^{(b)}/|\alpha^{(a)}|^2\right)\lambda_b \qquad (16.24)$$


This can and must be inverted. Let us define the Cartan matrix $A$, a real $l \times l$
matrix, as

$$A = (A_{ab}); \quad A_{ab} = 2\,\alpha^{(a)}\cdot\alpha^{(b)}/|\alpha^{(b)}|^2 \qquad (16.25)$$

It has extremely simple elements. Recalling that the $\alpha^{(a)}$ are simple roots, we
have the values:

    A_aa = 2,
    a ≠ b:  A_ab = 0          for α(a)·α(b) = 0, i.e., O   O  (circles not joined),
                 = -1         for O-O,
                 = -1 or -2   for O=O,
                 = -1 or -3   for O≡O                                   (16.26)

Even though this matrix is not symmetric, the simple values of its entries make
it easy to work with. If we wish, we can express $A$ as the product of a symmetric
and a diagonal matrix:

$$A = SD,$$
$$S_{ab} = S_{ba} = \alpha^{(a)}\cdot\alpha^{(b)},$$
$$D_{ab} = 2\delta_{ab}/|\alpha^{(a)}|^2 \qquad (16.27)$$

Linear independence of the simple roots means that $A$ is nonsingular, but $A^{-1}$
is again not symmetric. On the other hand, the matrix

$$G = A^{-1}D^{-1} = D^{-1}S^{-1}D^{-1} \qquad (16.28)$$

is symmetric. We shall express the solution for the highest weight $\Lambda$ in terms of
the set of integers $\{N_a\}$ by using this matrix $G$. The equation to be inverted,
Eq.(16.24), reads in matrix form:

$$N = A^T\lambda \qquad (16.29)$$

so the solution is

$$\lambda = (A^T)^{-1}N = (GD)^T N = DGN \qquad (16.30)$$

Writing out the individual components and putting them into Eq.(16.23),

$$\Lambda = \sum_{a,b} 2\,G_{ab}N_b\,\alpha^{(a)}/|\alpha^{(a)}|^2 \qquad (16.31)$$


This way of writing $\Lambda$ in terms of $\{N_a\}$ leads to the next important step.
We said that any choice of the $N$'s leads to a unique UIR and vice versa. We
can then consider the $l$ special UIR's obtained by setting all $N$'s but one zero,
and taking the non-zero one to be unity. In this way certain special highest
weights must arise. We write them as

$$\Lambda^{(a)} = \sum_b 2\,G_{ab}\,\alpha^{(b)}/|\alpha^{(b)}|^2,$$
$$\Lambda = \sum_a N_a\Lambda^{(a)} \qquad (16.32)$$


We use these particular highest weights in the following section. 
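The inversion described in this section can be carried out explicitly for a small example. The sketch below is illustrative and not from the original text: the function names are hypothetical, and the simple roots $e_1 - e_2$, $e_2 - e_3$, $e_3$ of $SO(7) = B_3$ are used as input. It assumes the conventions $A_{ab} = 2\alpha^{(a)}\cdot\alpha^{(b)}/|\alpha^{(b)}|^2$, $D_{ab} = 2\delta_{ab}/|\alpha^{(a)}|^2$ and $G = (DSD)^{-1}$, and uses exact rational arithmetic.

```python
from fractions import Fraction

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def inverse(m):
    """Gauss-Jordan inverse of a square matrix with Fraction entries."""
    n = len(m)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def cartan_matrix(roots):
    """A_ab = 2 alpha^(a) . alpha^(b) / |alpha^(b)|^2."""
    return [[2 * dot(a, b) / dot(b, b) for b in roots] for a in roots]

def fundamental_weights(roots):
    """Lambda^(a) = sum_b 2 G_ab alpha^(b) / |alpha^(b)|^2 with G = (D S D)^(-1)."""
    d = [2 / dot(a, a) for a in roots]
    g_inv = [[d[i] * dot(a, b) * d[j] for j, b in enumerate(roots)]
             for i, a in enumerate(roots)]
    g = inverse(g_inv)
    dim = len(roots[0])
    return [tuple(sum(g[a][b] * d[b] * roots[b][k] for b in range(len(roots)))
                  for k in range(dim))
            for a in range(len(roots))]

def highest_weight(n_labels, weights):
    """Lambda = sum_a N_a Lambda^(a)."""
    return tuple(sum(n * w[k] for n, w in zip(n_labels, weights))
                 for k in range(len(weights[0])))

F = Fraction
b3_roots = [(F(1), F(-1), F(0)), (F(0), F(1), F(-1)), (F(0), F(0), F(1))]
A = cartan_matrix(b3_roots)
weights = fundamental_weights(b3_roots)
print(A)        # rows (2,-1,0), (-1,2,-2), (0,-1,2)
print(weights)  # e1, e1+e2, (e1+e2+e3)/2
print(highest_weight((0, 0, 2), weights))  # e1+e2+e3
```

The half-integral third weight is the first sign of the spinor representations of the orthogonal groups.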


16.4 Fundamental UIR's, Survey of all UIR's


Cartan has shown that for any CSLA, there are $l$ basic or fundamental UIR's
out of which all others can be built via symmetrised direct products. We denote
the fundamental UIR's by $D^{(a)}$, $a = 1, 2, \ldots, l$:

$$D^{(a)} = a^{th} \text{ fundamental UIR with highest weight } \Lambda^{(a)}$$
$$= \text{UIR with } N_a = 1,\ N_b = 0 \text{ for } b \ne a \qquad (16.33)$$

In this sense, these $\Lambda^{(a)}$ are the simplest highest weights: they are called
fundamental dominant weights. Any dominant weight (i.e., the highest in an
equivalence class under the Weyl group) is a non-negative integral combination
of the $\Lambda^{(a)}$. The highest weight of a UIR is the linear combination (16.32).

The UIR uniquely determined by $\{N_a\}$ can be indicated by use of the
Dynkin diagram for the group. On this diagram we mark the positions of the
first, second, ..., $l^{th}$ simple roots $\alpha^{(1)}, \alpha^{(2)}, \ldots, \alpha^{(l)}$. Then just below or at the
corresponding circles we write the values of $N_1, N_2, \ldots, N_l$. In this depiction,
$D^{(a)}$ has unity at the $a^{th}$ root position, zero elsewhere.

How is the UIR $D \equiv \{N_a\}$ obtained from the fundamental UIR's $D^{(a)}$?
The prescription is to take $N_1$ factors of $D^{(1)}$, $N_2$ of $D^{(2)}$, ..., $N_l$ of $D^{(l)}$, take
the direct product of them all, and isolate the highest weight in this product.
This highest weight is evidently given by Eq.(16.32). Thus on reduction of this
direct product representation, the "largest piece" is the UIR $\{N_a\}$ we are after:

$$\underbrace{D^{(1)} \times \ldots \times D^{(1)}}_{N_1} \times \underbrace{D^{(2)} \times \ldots \times D^{(2)}}_{N_2} \times \ldots \times \underbrace{D^{(l)} \times \ldots \times D^{(l)}}_{N_l}$$
$$\Lambda = N_1\Lambda^{(1)} + N_2\Lambda^{(2)} + \ldots + N_l\Lambda^{(l)} \ \rightarrow\ \text{UIR } \{N_a\} \qquad (16.34)$$

The fundamental dominant weights $\Lambda^{(a)}$ determining the fundamental UIR's
out of which all UIR's can be built, have a simple geometrical relationship to
the simple roots $S$. Since each $\Lambda^{(a)}$ corresponds to $N_a = 1$, $N_b = 0$ for $b \ne a$,
we see that

$$2\Lambda^{(a)}\cdot\alpha^{(b)}/|\alpha^{(b)}|^2 = \delta_{ab} \qquad (16.35)$$
We therefore set up the following procedure. Starting with the simple roots
$\alpha^{(a)}$, we rescale them to define the linearly independent (but not normalised!)
vectors

$$\hat\alpha^{(a)} = 2\alpha^{(a)}/|\alpha^{(a)}|^2 \qquad (16.36)$$

Then the fundamental dominant weights form the reciprocal basis in root space
to this set of vectors

$$\Lambda^{(a)}\cdot\hat\alpha^{(b)} = \delta_{ab} \qquad (16.37)$$

Incidentally, these vectors $\hat\alpha^{(a)}$ are related to the symmetric matrix $G$
appearing in Eqs.(16.28),(16.30),(16.31),(16.32):

$$G^{-1} = DSD = (\hat\alpha^{(a)}\cdot\hat\alpha^{(b)});$$
$$G = (\hat\alpha^{(a)}\cdot\hat\alpha^{(b)})^{-1} \qquad (16.38)$$

The integers $\{N_a\}$ label UIR's uniquely, and they tell us graphically how to
build up general UIR's from the fundamental ones. Are there invariant operators
such that the $N_a$ or functions thereof are eigenvalues of these operators? There
are, namely the Casimir operators which are (symmetric) polynomials formed
out of the generators $X_j$. For example, the simplest quadratic one is

$$C_2 = g^{jk}X_jX_k = \sum_{a=1}^{l} H_a^2 + \sum_{\alpha \in \mathfrak{R}} E_\alpha E_{-\alpha} \qquad (16.39)$$

Then there are cubic and higher order expressions, in fact precisely $l$ independent
Casimir operators. However we will not pursue their properties in any detail.

For a CSLA of rank $l$, when is the UIR $\{N_a\}$ real or potentially real? When
is it essentially complex? Just as a UIR has a unique simple highest weight $\Lambda$,
it also has a lowest simple weight $\nu$. Then the UIR $\{N_a\}$ is equivalent to its
complex conjugate if and only if $\Lambda = -\nu$. More generally, if $\Lambda$ and $\nu$ are the
highest and lowest weights of the UIR $D$, then $-\nu$ and $-\Lambda$ are the highest and
lowest weights of the UIR $D^*$.

From this general criterion we see immediately how it is that each UIR of
$SU(2)$ is self conjugate: the maximum and minimum values of $m$, namely $j$ and $-j$, are
negatives of one another.

We have seen that the complete list of compact simple Lie groups is $SU(n)$
for $n \ge 2$; $SO(n)$ for $n \ge 7$; $USp(2n)$ for $n \ge 2$; $G_2$; $F_4$; $E_n$ for $n = 6, 7, 8$. Which
of these possess complex UIR's? A detailed analysis shows that only $SU(n)$ for
$n \ge 3$, $SO(4n+2)$ for $n \ge 2$ and $E_6$ have some complex UIR's; in all other cases
each UIR is real or pseudo real.


16.5 Fundamental UIR's for $A_l, B_l, C_l, D_l$


The simple roots for each of these groups have been given in Chapter 14. In
principle we can then calculate the fundamental dominant weights as the basis
reciprocal to $\{\hat\alpha^{(a)}\}$. The calculations are quite easy for $B_l, C_l$ and $D_l$, and a
bit involved for $A_l$. We will take them up in the same sequence $D_l, B_l, C_l, A_l$ in
which we dealt with them in Chapter 14.


Case of $D_l = SO(2l)$:


With $e_a$ being the standard unit vectors in Euclidean $l$-dimensional space, we
have from Eq.(14.20) the following set of simple roots:

$$\alpha^{(1)} = e_1 - e_2,\ \alpha^{(2)} = e_2 - e_3,\ \ldots,\ \alpha^{(l-1)} = e_{l-1} - e_l,\ \alpha^{(l)} = e_{l-1} + e_l \qquad (16.40)$$

Since each $|\alpha^{(a)}| = \sqrt{2}$, the rescaling (16.36) has no effect:

$$\hat\alpha^{(a)} = \alpha^{(a)} \qquad (16.41)$$

So the fundamental dominant weights $\Lambda^{(1)}, \Lambda^{(2)}, \ldots$ form a system reciprocal
to the set (16.40). At this point let us remember from Eq.(14.13) that in
the defining vector representation $D$ of $SO(2l)$, the highest weight is $e_1$. In fact,
the set of all weights in this representation can be arranged in decreasing order
thus:

$$W_D = \text{weights in defining representation } D \text{ of } SO(2l)$$
$$= \{e_1, e_2, \ldots, e_l, -e_l, \ldots, -e_2, -e_1\} \qquad (16.42)$$

We will use this information in interpreting the fundamental representations.

Let us now list the fundamental dominant weights $\Lambda^{(a)}$: they are easy to
find from the condition that they form a basis reciprocal to the simple roots
(16.40):

$$\Lambda^{(1)} = e_1,\ \Lambda^{(2)} = e_1 + e_2,\ \ldots,\ \Lambda^{(l-2)} = e_1 + e_2 + \ldots + e_{l-2},$$
$$\Lambda^{(l-1)} = \tfrac{1}{2}(e_1 + e_2 + \ldots + e_{l-1} - e_l),$$
$$\Lambda^{(l)} = \tfrac{1}{2}(e_1 + e_2 + \ldots + e_{l-1} + e_l) \qquad (16.43)$$

The "half integral" structure of the last two must come as somewhat of
a surprise! What can we say about the corresponding fundamental UIR's
$D^{(1)}, D^{(2)}, \ldots, D^{(l)}$? Comparing with the weights $W_D$ of the defining
representation $D$ in Eq.(16.42), we can easily see that

$$D^{(1)} = \text{defining vector representation } D$$
$$D^{(2)} = \text{representation on second rank antisymmetric tensors, } (D \times D)_{antisymmetric}$$
$$D^{(3)} = \text{representation on third rank antisymmetric tensors, } (D \times D \times D)_{antisymmetric}$$
$$\ldots$$
$$D^{(l-2)} = \text{representation on } (l-2)^{th} \text{ rank antisymmetric tensors.} \qquad (16.44)$$

The truth of these statements would be evident to you by using arguments
similar to those in quantum mechanics when enumerating the possible states
of identical fermions obeying the exclusion principle. So all the fundamental
UIR's except the last two really arise out of the defining representation $D$, and
its products with itself. The last two, however, can never be obtained by any
finite algebraic procedure starting with $D$; this is evident on comparing $\Lambda^{(l-1)}$
and $\Lambda^{(l)}$ with $W_D$! These last two "unexpected" fundamental UIR's of $SO(2l)$
are called spinor representations.

$$D^{(l-1)} = \text{first spinor representation, dimension } 2^{l-1}, \text{ also written } \Delta^{(1)},$$
$$D^{(l)} = \text{second spinor representation, dimension } 2^{l-1}, \text{ also written } \Delta^{(2)} \qquad (16.45)$$

We shall study the construction and properties of these spinor UIR's in the
next Chapter. At this point, it suffices to repeat that the $l$ fundamental UIR's
of $SO(2l)$ are the antisymmetric tensor representations of ranks $1, 2, \ldots, l-2$,
and the two (inequivalent) spinor representations $\Delta^{(1)}, \Delta^{(2)}$.

Associated with the group $SO(2l)$ is an antisymmetric Levi-Civita symbol
carrying $2l$ indices. So one knows that among antisymmetric tensors there is
no need to go beyond those of rank $l$. One naturally asks: where are the
antisymmetric tensors of ranks $l-1$ and $l$, and what status do they have? They
are of course not in the list of fundamental UIR's. It turns out that the $(l-1)^{th}$
rank antisymmetric tensors furnish a UIR with the highest weight

$$\Lambda = e_1 + e_2 + \ldots + e_{l-1} = \Lambda^{(l-1)} + \Lambda^{(l)} \qquad (16.46)$$

Thus, in the $\{N_a\}$ notation, this is the UIR $(0, 0, \ldots, 1, 1)$, and it occurs in
the direct product $\Delta^{(1)} \times \Delta^{(2)}$ of the two spinor UIR's. As for the $l^{th}$ rank
antisymmetric tensors, they are reducible into self dual and antiself dual types,
and depending on one's definitions, one is present in $\{\Delta^{(1)} \times \Delta^{(1)}\}_{symm}$ and
the other in $\{\Delta^{(2)} \times \Delta^{(2)}\}_{symm}$. We shall deal with these in some detail in the
next Chapter.


Case of $B_l = SO(2l+1)$:

The set of simple roots in this case is, from Eq.(14.28),

$$\alpha^{(1)} = e_1 - e_2,\ \alpha^{(2)} = e_2 - e_3,\ \ldots,\ \alpha^{(l-1)} = e_{l-1} - e_l,\ \alpha^{(l)} = e_l \qquad (16.47)$$

After rescaling according to Eq.(16.36), we get the vectors $\hat\alpha^{(a)}$:

$$\hat\alpha^{(1)} = e_1 - e_2,\ \hat\alpha^{(2)} = e_2 - e_3,\ \ldots,\ \hat\alpha^{(l-1)} = e_{l-1} - e_l,\ \hat\alpha^{(l)} = 2e_l \qquad (16.48)$$

Now the set of weights in the defining representation, arranged from the highest
down to the lowest, are:

$$W_D = \text{weights in defining representation } D \text{ of } SO(2l+1)$$
$$= \{e_1, e_2, \ldots, e_l, 0, -e_l, \ldots, -e_2, -e_1\} \qquad (16.49)$$

We now have the requisite information to construct the fundamental
dominant weights and interpret the corresponding fundamental UIR's: the former
are

$$\Lambda^{(1)} = e_1,\ \Lambda^{(2)} = e_1 + e_2,\ \Lambda^{(3)} = e_1 + e_2 + e_3,\ \ldots,$$
$$\Lambda^{(l-1)} = e_1 + e_2 + \ldots + e_{l-1},$$
$$\Lambda^{(l)} = \tfrac{1}{2}(e_1 + e_2 + \ldots + e_l) \qquad (16.50)$$

We are thus able to recognise the first $(l-1)$ fundamental UIR's:

$$D^{(1)} = \text{defining vector representation } D$$
$$D^{(2)} = \text{representation on second rank antisymmetric tensors, } (D \times D)_{antisymmetric}$$
$$\ldots$$
$$D^{(l-1)} = \text{representation on } (l-1)^{th} \text{ rank antisymmetric tensors} \qquad (16.51)$$

In contrast to $SO(2l)$, at the end of this list there is now only one spinor
UIR:

$$D^{(l)} = \text{unique spinor representation, dimension } 2^l, \text{ written } \Delta \qquad (16.52)$$

This spinor representation can, of course, not be obtained from $D$ by any finite
algebraic procedure. We study it in detail in Chapter 17.

For $SO(2l+1)$, the independent antisymmetric tensors are those of ranks
$1, 2, \ldots, l-1, l$. All but the last have appeared in the list of fundamental UIR's.
The rank $l$ antisymmetric tensors do not yield a fundamental UIR. The highest
weight for their representation is clearly the sum of the first $l$ weights in $D$ listed
in Eq.(16.49):

$$\Lambda = e_1 + e_2 + \ldots + e_l = 2\Lambda^{(l)} \qquad (16.53)$$

Therefore, in the $\{N_a\}$ notation they furnish the UIR $(0, 0, \ldots, 0, 2)$, and this is
the leading UIR in the symmetric part of the product $\Delta \times \Delta$.


Case of $C_l = USp(2l)$:


Interestingly, the simple roots and the scaled simple roots now are just
interchanged as compared to $B_l = SO(2l+1)$: from Eq.(14.48),

$$\alpha^{(1)} = e_1 - e_2,\ \alpha^{(2)} = e_2 - e_3,\ \ldots,\ \alpha^{(l-1)} = e_{l-1} - e_l,\ \alpha^{(l)} = 2e_l;$$
$$\hat\alpha^{(1)} = e_1 - e_2,\ \hat\alpha^{(2)} = e_2 - e_3,\ \ldots,\ \hat\alpha^{(l-1)} = e_{l-1} - e_l,\ \hat\alpha^{(l)} = e_l \qquad (16.54)$$

The weights of the defining representation given in Eq.(14.43) are the same as
with $SO(2l)$:

$$W_D = \{e_1, e_2, \ldots, e_l, -e_l, \ldots, -e_2, -e_1\} \qquad (16.55)$$

The reciprocal basis to $\hat\alpha^{(a)}$ is very easily found. There are no half integers
anywhere now:

$$\Lambda^{(1)} = e_1,\ \Lambda^{(2)} = e_1 + e_2,\ \ldots,$$
$$\Lambda^{(l-1)} = e_1 + e_2 + \ldots + e_{l-1},$$
$$\Lambda^{(l)} = e_1 + e_2 + \ldots + e_l \qquad (16.56)$$

There is a uniform pattern now all the way up to the last one. We can say
that the fundamental UIR's $D^{(a)}$, $a = 1, 2, \ldots, l$, are given by the $a^{th}$ rank
antisymmetric tensors over the defining vector UIR $D$, except that in this
symplectic case it is the antisymmetric objects that must have traces removed! The
trace operation of course involves contraction with the symplectic metric
of Eq.(14.30). So the fundamental UIR's for $USp(2l)$ are

$$a = 1, 2, \ldots, l: \quad D^{(a)} = a^{th} \text{ rank antisymmetric traceless tensors over defining vector representation } D \qquad (16.57)$$

There are no spinors for the symplectic groups.


Case of $A_l = SU(l+1)$:


We have noticed in Chapter 14 that the calculations of weights and roots for
the groups $SU(l+1)$ tend to be slightly messy compared to the other families!
The simple roots are listed in Eq.(14.65). Each of them has length $\sqrt{2}$, so we
have no change on rescaling. We write them a bit more compactly as

$$\alpha^{(a)} = \hat\alpha^{(a)} = \frac{1}{\sqrt{a+1}}\left(\sqrt{a}\, e_a - \sqrt{a+2}\, e_{a+1}\right), \quad a = 1, 2, \ldots, l-1,$$
$$\alpha^{(l)} = \hat\alpha^{(l)} = \sum_{a=1}^{l-1} \frac{e_a}{\sqrt{a(a+1)}} + \sqrt{\frac{l+1}{l}}\, e_l \qquad (16.58)$$

Next, in listing the weights of the defining representation $D$, given in
Eq.(14.56), it is helpful to define the following unnormalised multiples of the
unit vectors $e_a$:

$$f_a = e_a/\sqrt{a(a+1)}, \quad a = 1, 2, \ldots, l \qquad (16.59)$$

Then the highest weight in $D$, followed by the successively lower weights are:

$$\mu^{(1)} = f_1 + f_2 + \ldots + f_{l-1} + f_l;$$
$$\mu^{(l+1)} = -l f_l;$$
$$\mu^{(l)} = -(l-1) f_{l-1} + f_l;$$
$$\mu^{(l-1)} = -(l-2) f_{l-2} + f_{l-1} + f_l;$$
$$\ldots$$
$$\mu^{(2)} = -f_1 + f_2 + \ldots + f_{l-1} + f_l \qquad (16.60)$$

Now we must construct the $l$ fundamental dominant weights. First we recognise
the role of the defining UIR $D$. Its highest weight $\mu^{(1)}$ is seen to obey

$$\mu^{(1)}\cdot\hat\alpha^{(a)} = 0, \quad a = 1, 2, \ldots, l-1;$$
$$\mu^{(1)}\cdot\hat\alpha^{(l)} = 1 \qquad (16.61)$$

Thus according to the general definition, this is the $l^{th}$ fundamental UIR:

$$\Lambda^{(l)} = \mu^{(1)}, \quad D \equiv D^{(l)} \qquad (16.62)$$

To find the other fundamental dominant weights, it is helpful to expand a
general vector in the form

$$x = \sum_{a=1}^{l} x_a f_a = \sum_{a=1}^{l} x_a e_a/\sqrt{a(a+1)} \qquad (16.63)$$

so that products with the simple roots simplify:

$$x\cdot\hat\alpha^{(a)} = (x_a - x_{a+1})/(a+1), \quad a = 1, 2, \ldots, l-1;$$
$$x\cdot\hat\alpha^{(l)} = \sum_{a=1}^{l-1} x_a/a(a+1) + x_l/l \qquad (16.64)$$

Using such expansions, one can quickly discover the reciprocal basis to the $\hat\alpha^{(a)}$:

$$\Lambda^{(a)} = f_1 + f_2 + \ldots + f_a - a(f_{a+1} + \ldots + f_l)$$
$$= \sum_{b=1}^{a} e_b/\sqrt{b(b+1)} - a \sum_{b=a+1}^{l} e_b/\sqrt{b(b+1)}, \quad a = 1, 2, \ldots, l-1 \qquad (16.65)$$

The $l^{th}$ fundamental dominant weight $\Lambda^{(l)} = \mu^{(1)}$ can also be taken to be given
by this same general expression. These $\Lambda^{(a)}$ can now be expressed in terms of
the successive weights of $D$ listed in Eq.(16.60), namely, beginning with (16.62):

$$\Lambda^{(l)} = \mu^{(1)},$$
$$\Lambda^{(l-1)} = \mu^{(1)} + \mu^{(l+1)},$$
$$\Lambda^{(l-2)} = \mu^{(1)} + \mu^{(l+1)} + \mu^{(l)},$$
$$\ldots$$
$$\Lambda^{(1)} = \mu^{(1)} + \mu^{(l+1)} + \mu^{(l)} + \ldots + \mu^{(3)} \qquad (16.66)$$

These expressions show that the $l$ fundamental UIR's of $A_l = SU(l+1)$ are
the defining representation $D$, and the antisymmetric tensors over $D$ of ranks
$2, 3, \ldots, l$ as one might have expected:

$$a = 1, 2, \ldots, l: \quad D^{(a)} = \text{antisymmetric tensors of rank } (l+1-a) \text{ over } D \qquad (16.67)$$


As with the symplectic groups, there are no spinors for the unitary groups. 
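Because the $SU(l+1)$ formulas involve square roots, it is convenient to check them in the oblique basis $f_a$, where everything stays rational. The sketch below is illustrative and not from the original text; the function names are hypothetical. It assumes that a vector $x = \sum_a x_a f_a$ has products $(x_a - x_{a+1})/(a+1)$ with the first $l-1$ simple roots and $\sum_{a<l} x_a/a(a+1) + x_l/l$ with the last one, and that the defining weights are $\mu^{(k)} = -(k-1) f_{k-1} + f_k + \ldots + f_l$ for $k = 1, \ldots, l+1$.

```python
from fractions import Fraction

def products_with_simple_roots(x):
    """x . alpha^(a), a = 1..l, for x given by its coefficients in the f basis.

    Assumed formulas: (x_a - x_(a+1))/(a+1) for a < l, and
    sum_{a<l} x_a/(a(a+1)) + x_l/l for a = l.
    """
    l = len(x)
    prods = [Fraction(x[i] - x[i + 1], i + 2) for i in range(l - 1)]
    last = sum(Fraction(x[i], (i + 1) * (i + 2)) for i in range(l - 1))
    last += Fraction(x[l - 1], l)
    return prods + [last]

def defining_weight(k, l):
    """Coefficients of mu^(k) = -(k-1) f_(k-1) + f_k + ... + f_l, k = 1..l+1."""
    return [-(k - 1) if a == k - 1 else (1 if a >= k else 0)
            for a in range(1, l + 1)]

def fundamental_weight(a, l):
    """Lambda^(a) as the sum of the l+1-a highest weights of D:
    mu^(1) + mu^(l+1) + mu^(l) + ... + mu^(a+2)."""
    terms = [defining_weight(1, l)]
    terms += [defining_weight(k, l) for k in range(a + 2, l + 2)]
    return [sum(col) for col in zip(*terms)]

l = 4
for a in range(1, l + 1):
    lam = fundamental_weight(a, l)
    print(a, lam, [int(p) for p in products_with_simple_roots(lam)])
```

Each printed row shows the coefficients $1, \ldots, 1, -a, \ldots, -a$ of $\Lambda^{(a)}$ in the $f$ basis, followed by a list with a single 1 in position $a$, which is the reciprocity condition.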

One might be tempted to rework the algebra for the $A_l$ family of groups
to avoid the rather "peculiar" patterns for $\Lambda^{(a)}$, weights $\mu$ in $D$, etc. that we
have found. However, that is bound to introduce complications elsewhere, for
instance in the choice of the Cartan subalgebra generators $H_a$. We have made
what seems like a simple and uncomplicated choice of $H_a$ to begin with, and
then gone through the general theory in a systematic way.


16.6 The Elementary UIR's


The $l$ fundamental UIR's of a rank $l$ group $G$ are the building blocks for all
UIR's in the sense that the general UIR $\{N_a\}$ is the "largest" piece in the direct
product of $N_1$ factors $D^{(1)}$, $N_2$ factors $D^{(2)}$, .... It is isolated by working with
the highest weight in this product,

$$\Lambda\{N\} = N_1\Lambda^{(1)} + N_2\Lambda^{(2)} + \ldots, \qquad (16.68)$$

evidently lying in the product of the completely symmetric parts of $\{\otimes D^{(1)}\}^{N_1}$,
$\{\otimes D^{(2)}\}^{N_2}, \ldots$.

But we have seen that for $A_l, B_l, C_l$ and $D_l$, many of the fundamental UIR's
are antisymmetric tensors over the defining UIR $D$! Thus, if we have the defining
UIR in hand, and maybe one or two more, then by the combined processes of
antisymmetrization and symmetrization of products, all the other basic UIR's
and then all UIR's can be formed. The number of elementary UIR's needed for
this construction is just 2 or 3. The precise definition is that they correspond
to the "ends" of the Dynkin diagram. So in the sequence $D_l, B_l, C_l, A_l$, we have
the elementary UIR's:

$D_l = SO(2l)$ - Three elementary UIR's:
    $\{N_a\} = \{1, 0, \ldots\}$ : vector UIR, $D$;
    $\{N_a\} = \{0, 0, \ldots, 1, 0\}$ : spinor $\Delta^{(1)}$;
    $\{N_a\} = \{0, \ldots, 0, 1\}$ : spinor $\Delta^{(2)}$.

$B_l = SO(2l+1)$ - Two elementary UIR's:
    $\{N_a\} = \{1, 0, \ldots, 0\}$ : vector UIR, $D$;
    $\{N_a\} = \{0, 0, \ldots, 0, 1\}$ : spinor $\Delta$.

$C_l = USp(2l)$ - Two elementary UIR's:
    $\{N_a\} = \{1, 0, \ldots, 0\}$ : defining UIR $D$;
    $\{N_a\} = \{0, \ldots, 0, 1\}$ : $l^{th}$ rank antisymmetric "traceless" tensors over $D$.

$A_l = SU(l+1)$ - Two elementary UIR's:
    $\{N_a\} = \{0, 0, \ldots, 0, 1\}$ : defining "vector" UIR $D$;
    $\{N_a\} = \{1, 0, \ldots, 0\}$ : $l^{th}$ rank antisymmetric tensors over $D$.

However, in the $C_l$ and $A_l$ cases, it makes good sense to say that there is
only one elementary UIR, and that is the defining UIR $D$!


16.7 Structure of States within a UIR 


We have seen how the UIR’s of a compact simple Lie algebra can as a family be 
classified, and also how they can be built up starting from either fundamental 
or elementary UIR’s. We worked out all the relevant details for the classical 
nonexceptional families of groups. Now let us briefly look “inside” a generic 
UIR D, to get a feeling for the kinds of problems and structures involved. 

Given a CSLA $\mathcal{L}$ of rank $l$ and order $r$, what is the dimension of the UIR
$\{N_a\}$? Suffice it to say that there do exist explicit formulas due to Weyl, both
for the nonexceptional and the exceptional cases. The former expressions are
given in Salam’s lectures, the latter are included in the book by Wybourne,
besides many other places.

Let $W$ be the set of all weights occurring in $D$, $\Lambda$ the highest weight, and
$\mu$ a general one. While $\Lambda$ is simple, in general $\mu$ is not: this is the multiplicity
problem. In the most general UIR $\{N_a\}$, it turns out that a basis vector in
$V$ needs $\frac{1}{2}(r - 3l)$ additional independent state labels or quantum numbers
to supplement the $l$ weight vector components $\mu_a$ which are the eigenvalues
of the diagonal generators $H_a$, $a = 1,2,\ldots,l$. Thus, in general, a complete
commuting set within a UIR consists of $\frac{1}{2}(r - l)$ operators, $l$ of them being
generators, and the remainder (in principle) functions of generators (but of
course not Casimir operators). In some particular UIR’s, it could happen that




there is no multiplicity problem — for instance this was the case in all the defining 
UIR’s. 
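
As a quick illustration of this counting (a standard example, not worked out in the text above), take $SU(3)$, which has rank $l = 2$ and order $r = 8$:

```latex
% SU(3): rank l = 2, order r = 8
\tfrac{1}{2}(r - 3l) = \tfrac{1}{2}(8 - 6) = 1, \qquad
\tfrac{1}{2}(r - l) = \tfrac{1}{2}(8 - 2) = 3 .
```

So one label beyond the two weight components is needed in a general UIR. In the canonical reduction with respect to $SU(2) \times U(1)$ this extra label can be taken to be the $SU(2)$ Casimir (total isospin), giving three commuting operators in all.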

For $A_l = SU(l+1)$ and for $SO(n)$ comprising both $B_l = SO(2l+1)$ and
$D_l = SO(2l)$, there are natural or “canonical” ways of choosing these $\frac{1}{2}(r - 3l)$
additional diagonal operators: they are based on the fact that if a UIR of
$SU(l+1)$ (respectively $SO(n)$) is reduced with respect to the subgroup
$SU(l) \times U(1)$ [respectively $SO(n-1)$], each UIR of the latter which occurs does so just
once. So one can use the Casimir operators (i.e., basically the UIR labels $\{N\}$)
of the chain of subgroups $SU(l+1) \supset SU(l) \supset SU(l-1) \ldots \supset SU(2)$ in
the unitary case, and of $SO(n) \supset SO(n-1) \supset SO(n-2) \ldots \supset SO(3)$ in the
orthogonal case, to solve the multiplicity and state-labelling problem.

Apart from this, what can one say in general terms about the properties
of a weight $\mu \in W$? Clearly, by the application of the $SU(2)^{(\alpha)}$ and $SU(2)^{(a)}$
subalgebras,


$$\mu \in W \Rightarrow 2\mu\cdot\alpha^{(a)}/|\alpha^{(a)}|^2 = n_a = 0, \pm 1, \pm 2, \ldots \tag{16.69}$$


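In other words, all the states $|\mu\ldots\rangle$ for a given $\mu \in W$ are simultaneous
eigenstates of the operators $J_3^{(a)}$. Given the integers $\{n_a\}$, $\mu$ is uniquely
determined just as $\Lambda$ was by $\{N_a\}$:


$$2\mu\cdot\alpha^{(a)}/|\alpha^{(a)}|^2 = n_a \Leftrightarrow \mu = \sum_{a=1}^{l} n_a\Lambda^{(a)} \tag{16.70}$$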


But for a given UIR $\{N_a\}$, what $\mu$ and so what $\{n_a\}$ arise?
The general answer is as follows. Every $\mu \in W$ is obtained from $\Lambda$ by
subtracting a unique non-negative linear combination of the simple roots $\alpha^{(a)}$
with integer coefficients:


$$\Lambda = \text{highest weight},\ \mu \in W \Rightarrow \mu = \Lambda - \sum_{a=1}^{l}\nu_a\alpha^{(a)}, \quad \nu_a\ \text{unique, integer},\ \geq 0. \tag{16.71}$$


The uniqueness follows from the linear independence of the $\alpha^{(a)}$. These
$\nu_a$ are related to the $n_a$ (find the relation!). Now knowing $\Lambda$, how do we find
out what can be subtracted to get allowed weights present in $D$? There is a
recursive process: we express $W$ as the union of subsets,


$$W = W^{(0)} \cup W^{(1)} \cup W^{(2)} \cup \ldots \tag{16.72}$$




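Here,


$$\begin{aligned}
W^{(0)} &= \Lambda = \text{simple highest weight}; \\
W^{(1)} &= \Big\{\mu \in W \,\Big|\, \sum_{a=1}^{l}\nu_a = 1\Big\}; \\
W^{(2)} &= \Big\{\mu \in W \,\Big|\, \sum_{a=1}^{l}\nu_a = 2\Big\}; \ \ldots
\end{aligned} \tag{16.73}$$


So $W^{(k)}$ is called the $k$th layer of weights: every $\mu \in W^{(k)}$ is $k$ simple roots
away from $\Lambda$!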

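How do we find $W^{(1)}$? In other words: which $\alpha^{(a)}$ can be subtracted from
$\Lambda$? Evidently, with respect to each $SU(2)^{(a)}$ we see from Eq.(16.19) that $\Lambda$ is
the “maximum $m$” state:


$$\Lambda \text{ under } SU(2)^{(a)}: \quad m = j = \tfrac{1}{2}N_a. \tag{16.74}$$


Therefore $N_a \geq 1$ is the necessary and sufficient condition for $\Lambda - \alpha^{(a)}$ to
be present:

$$N_a \geq 1 \Leftrightarrow \Lambda - \alpha^{(a)} \in W^{(1)} \subset W \tag{16.75}$$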


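In general, let $\mu \in W^{(k)}$, and $\alpha^{(a)} \in S$. The question is: does the next layer
$W^{(k+1)}$ contain $\mu - \alpha^{(a)}$? The answer is given by an application of $SU(2)^{(a)}$ to
$\mu$, as one would of course expect. With respect to $SU(2)^{(a)}$, $\mu$ corresponds to a
magnetic quantum number,


$$\mu \text{ under } SU(2)^{(a)}: \quad m = \mu\cdot\alpha^{(a)}/|\alpha^{(a)}|^2 = \tfrac{1}{2}n_a \tag{16.76}$$


We now see: if the contents of $W^{(0)}, W^{(1)}, \ldots, W^{(k)}$ are known; if
$\mu \in W^{(k)}$, $\mu + \alpha^{(a)} \in W^{(k-1)}$, $\mu + 2\alpha^{(a)} \in W^{(k-2)}$, $\ldots$,
$\mu + p\alpha^{(a)} \in W^{(k-p)}$, $\mu + (p+1)\alpha^{(a)} \notin W$,


and if
$$m > -(m+p) \quad \text{or} \quad m > -p/2,$$


i.e.: $m$ is not yet the least possible value of the magnetic quantum number, then
$\mu - \alpha^{(a)} \in W^{(k+1)}$.
More formally one can express the situation thus:

$$\begin{aligned}
&\mu \in W^{(k)},\ \mu + p'\alpha^{(a)} \in W \text{ for } p' = 1,2,\ldots,p, \\
&\mu + (p+1)\alpha^{(a)} \notin W,\ n_a + p > 0 \ \Rightarrow \\
&\mu - \alpha^{(a)} \in W^{(k+1)}
\end{aligned} \tag{16.77}$$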




In principle, then, once $\Lambda$ is chosen, and knowing $S$, all the weights present in
$D$ are known.
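
The layer-by-layer rule of Eq.(16.77) can be sketched in a few lines of Python. This is only an illustrative sketch: weights are stored by their integer labels $n_a$, and it is assumed that row $a$ of the matrix `cartan` holds the labels of the simple root $\alpha^{(a)}$, so that subtracting $\alpha^{(a)}$ means subtracting that row. The function name `weight_layers` and the constant `SU3_CARTAN` are hypothetical, and the sketch finds which weights occur but not their multiplicities.

```python
def weight_layers(cartan, highest):
    """Return the layers W^(0), W^(1), ... of weights, in integer labels n_a.

    Assumption: row a of `cartan` holds the labels of the simple root
    alpha^(a), so subtracting alpha^(a) means subtracting that row.
    """
    def shift(mu, a, sign):
        return tuple(m + sign * c for m, c in zip(mu, cartan[a]))

    layers = [[tuple(highest)]]
    known = {tuple(highest)}            # all weights found so far
    while layers[-1]:
        next_layer = set()
        for mu in layers[-1]:
            for a in range(len(cartan)):
                # p counts how far the alpha^(a)-string extends above mu
                p, up = 0, shift(mu, a, +1)
                while up in known:
                    p, up = p + 1, shift(up, a, +1)
                if mu[a] + p > 0:       # m is not yet the lowest value
                    next_layer.add(shift(mu, a, -1))
        known |= next_layer
        layers.append(sorted(next_layer, reverse=True))
    return layers[:-1]                  # drop the final empty layer


SU3_CARTAN = [[2, -1], [-1, 2]]
for k, layer in enumerate(weight_layers(SU3_CARTAN, (1, 1))):
    print(k, layer)
```

For the $SU(3)$ UIR $(1,1)$ this walks down through five layers, from $(1,1)$ at the top to $(-1,-1)$ at the bottom.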

The level of a weight $\mu$ is the same as the “layer number” $k$ above, namely
the “number of simple roots” to be subtracted from $\Lambda$ to get to it. This is of
course not the same as the multiplicity of $\mu$, though all the weights of a given
layer or at a given level do have the same multiplicity. This multiplicity is given
by a formula due to Freudenthal (quoted for instance in the book by Wybourne).
The “highest layer” is the one with maximum $k$, and this maximum value is
denoted by $T(\Lambda)$. It is also called the “height” of the UIR. The behaviour of
multiplicities shows a pyramidal or, better, “spindle-shaped” structure. Multiplicity
at level $k$ is the same as at level $T(\Lambda) - k$. The highest layer $W^{(0)}$ has a
simple weight $\Lambda$, so multiplicity one; then the multiplicity keeps increasing for
a while as we come down to lower layers, reaching a maximum at $k = \frac{1}{2}T(\Lambda)$.
After that it keeps decreasing again, until for the lowest weight v it becomes
unity again. Exercise: Construct the simple roots $S$ and the weight and multiplicity
diagrams for the UIR’s (1, 0), (0, 1), (1, 1), (2, 0), (0, 2) of $SU(3)$. In the
process many of the general results described above become clear.


Exercises for Chapter 16 


1. For the orthogonal groups $D_l = SO(2l)$, $B_l = SO(2l+1)$, prove that the
generators belong to the second rank antisymmetric tensor representations
$D^{(2)}$, and that this is the adjoint representation.


2. Work out, in the SU(3) case, the simple roots S, and the weight and 
multiplicity diagrams for the UIR’s (1, 0), (0,1), (1, 1), (2, 0) and (0, 2). 


3. Is the adjoint representation (1, 1) of SU(3) a faithful representation? If 
not, why not? 


Chapter 17 


Spinor Representations for 
Real Orthogonal Groups 


We have seen in the previous Chapter that among the fundamental UIR’s for
the groups $D_l = SO(2l)$ and $B_l = SO(2l+1)$ there are some “unusual” representations
which cannot be obtained by any finite algebraic means from the
familiar defining vector representations of these groups. These are the spinor
representations. For $D_l$ we saw that there are two inequivalent spinor UIR’s,
which we denoted by $\Delta^{(1)}$ and $\Delta^{(2)}$; their descriptions as fundamental UIR’s,
and their highest weights, were found to be (see Eqs.(16.43),(16.45)):


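$$\begin{aligned}
D^{(l-1)} \equiv \Delta^{(1)} = \{0,\ldots,0,1,0\} &: \ \Lambda^{(l-1)} = \tfrac{1}{2}(e_1 + e_2 + \ldots + e_{l-1} - e_l) \\
&\quad = (1/2,1/2,\ldots,1/2,-1/2); \\
D^{(l)} \equiv \Delta^{(2)} = \{0,\ldots,0,1\} &: \ \Lambda^{(l)} = \tfrac{1}{2}(e_1 + e_2 + \ldots + e_{l-1} + e_l) \\
&\quad = (1/2,1/2,\ldots,1/2,1/2)
\end{aligned} \tag{17.1}$$

For $B_l$, there is only one spinor UIR, which we wrote as $\Delta$ (see Eqs.(16.50),(16.52)):


$$D^{(l)} \equiv \Delta = \{0,\ldots,0,1\}: \ \Lambda^{(l)} = \tfrac{1}{2}(e_1 + e_2 + \ldots + e_l) = (1/2,1/2,\ldots,1/2) \tag{17.2}$$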


As part of the general Cartan theory of UIR’s and fundamental UIR’s of 
compact simple Lie groups and algebras, these spinor representations more or 
less fall out automatically, once the root systems and simple roots for $D_l$ and $B_l$
are understood. At the same level one also understands and accepts that for the 
unitary and the symplectic groups there is nothing analogous to spinors. But 
from the point of view of physical applications, spinors are unusually interesting 
and useful quantities, and it is therefore appropriate that we understand them 
in some detail from a somewhat different starting point. In doing so, of course, 
we can be guided by the general theory outlined in the previous Chapters. 




In the succeeding sections, we shall deal at first with the spinor UIR’s
for $D_l = SO(2l)$, and later for $B_l = SO(2l+1)$. In each case we also clarify
some of the relationships between spinors and antisymmetric tensors which were
briefly alluded to in the previous Chapter. The entire treatment is based on the
representations of the Dirac algebra, many aspects of which were first worked out
by Brauer and Weyl. It is possible, and probably most elegant, to base the entire
analysis on an unspecified representation of the Dirac algebra, i.e., to develop
everything in a representation independent way. However, taking inspiration
from Feynman, we shall do all our calculations in a specific representation, and
believe there is no sense of defeat in doing so!

The $SO(2l)$ and $SO(2l+1)$ treatments in this Chapter will form a basis for
studying spinors for the pseudo-orthogonal groups $SO(p,q)$ in the next Chapter.


17.1 The Dirac Algebra in Even Dimensions 


The introduction of spinors in the context of the Lorentz group, based on the
Dirac anticommutation relations which arise in the relativistic electron equation,
is quite well-known. We start from a similar algebra for arriving at spinors with
respect to $SO(2l)$.

Consider the problem of finding hermitian matrices $\gamma_A$ of suitable dimension
obeying the algebraic relations


$$\{\gamma_A, \gamma_B\} \equiv \gamma_A\gamma_B + \gamma_B\gamma_A = 2\delta_{AB}, \quad A,B = 1,2,\ldots,2l \tag{17.3}$$


The range of indices is appropriate to $SO(2l)$, and as in the previous discussions
of this group, we will use split index notation when necessary. In fact,
in such notation, Eq.(17.3) appears as


$$\{\gamma_{ar}, \gamma_{bs}\} = 2\delta_{ab}\delta_{rs}; \quad a,b = 1,2,\ldots,l; \ r,s = 1,2. \tag{17.4}$$


It is an important fact that, up to unitary equivalence, there is only one
irreducible hermitian set $\gamma_A$ obeying these relations, and the dimension is $2^l$.
The proof is given, for instance, in Brauer and Weyl. We will give an explicit
construction, but it is instructive to recognise immediately the “source” of the
spinor UIR’s of $SO(2l)$: it lies in the essential uniqueness of the solution to
(17.3). For, if $S$ is any matrix in the defining representation $D$ of $SO(2l)$, i.e.,
a real orthogonal unimodular $2l$-dimensional matrix, then


$$\gamma'_A = S_{BA}\gamma_B \tag{17.5}$$


is also a solution to Eq.(17.3), if $\gamma_A$ is. By the uniqueness statement, there exists
then a unitary matrix $U(S)$, determined up to a phase factor, which connects
$\gamma'_A$ to $\gamma_A$, and as is easily seen, which again up to a phase gives a representation
of $SO(2l)$:


$$\begin{aligned}
S \in SO(2l): \quad & U(S)\gamma_A U(S)^{-1} = S_{BA}\gamma_B & (a) \\
& U(S')U(S) = (\text{phase})\,U(S'S) & (b)
\end{aligned} \tag{17.6}$$




The $U(S)$ contain the spinorial UIR’s of $SO(2l)$, and this is their origin.
We will construct their infinitesimal generators in the sequel.

To construct a solution to (17.3) in a convenient form, we take $l$ independent
(i.e., mutually commuting) sets of Pauli matrices $\sigma_j^{(a)}$, $j = 1,2,3$, $a = 1,2,\ldots,l$.
Since each Pauli triplet needs a two dimensional vector space to act upon, the
entire set is defined on a vector space $V$ of dimension $2^l$. The construction of
$\gamma_A$ is best expressed in split index notation as


$$\gamma_A \equiv \gamma_{ar} = \sigma_r^{(a)}\sigma_3^{(a+1)}\sigma_3^{(a+2)}\ldots\sigma_3^{(l)}, \quad r = 1,2, \ a = 1,2,\ldots,l \tag{17.7}$$


Notice that $\gamma_{ar}$ contains no contributions (factors) from $\sigma^{(1)}, \sigma^{(2)}, \ldots, \sigma^{(a-1)}$.
(The $2 \times 2$ unit matrices in those slots are not indicated but left implicit.)

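The hermiticity of $\gamma_A$ is obvious from that of the $\sigma$’s. The irreducibility is
also quite easy to prove: we will be showing in a later section that each $\sigma_3^{(a)}$ can
be expressed as a product of two of the $\gamma$’s (in fact this is already evident from
Eq.(17.7)); from this fact we then see immediately that $\sigma_1^{(a)}$ and $\sigma_2^{(a)}$ can also
be expressed as a product of $\gamma$’s. Since this is so, and since the complete set of
all $l$ Pauli triplets is certainly irreducible, the irreducibility of the $\gamma_A$ follows.
Checking that the Dirac algebra is satisfied is easy; it is best to look at $a < b$
and $a = b$ separately:


$$\begin{aligned}
a < b: \quad & \gamma_{ar}\gamma_{bs} = \sigma_r^{(a)}\sigma_3^{(a+1)}\ldots\sigma_3^{(b-1)}\,\sigma_3^{(b)}\sigma_s^{(b)}, \\
& \gamma_{bs}\gamma_{ar} = \sigma_r^{(a)}\sigma_3^{(a+1)}\ldots\sigma_3^{(b-1)}\,\sigma_s^{(b)}\sigma_3^{(b)} \ \Rightarrow \\
& \gamma_{ar}\gamma_{bs} = -\gamma_{bs}\gamma_{ar}; \\
a = b: \quad & \gamma_{ar}\gamma_{as} = \sigma_r^{(a)}\sigma_s^{(a)} \ \Rightarrow \\
& \{\gamma_{ar}, \gamma_{as}\} = 2\delta_{rs}, \ \text{no sum on } a
\end{aligned} \tag{17.8}$$


For uniqueness up to unitary equivalence, the reader may refer to Brauer and
Weyl.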
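
The construction (17.7) is easy to check numerically for small $l$. The following is a minimal sketch in plain Python, with matrices stored as nested lists; the helper names (`kron`, `matmul`, `gammas`, `is_hermitian`, `satisfies_dirac_algebra`) are illustrative choices and not part of the text. The tensor-product slots $1,\ldots,l$ are realised by Kronecker products, and the split index $(a, r)$ is mapped to $A = 2a - 2 + r$.

```python
# Pauli matrices sigma_1, sigma_2, sigma_3 and the 2x2 unit matrix.
I2 = [[1, 0], [0, 1]]
SIGMA = {1: [[0, 1], [1, 0]], 2: [[0, -1j], [1j, 0]], 3: [[1, 0], [0, -1]]}


def kron(A, B):
    """Kronecker (tensor) product of two matrices stored as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def matmul(A, B):
    """Ordinary matrix product."""
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]


def gammas(l):
    """gamma_{ar} = sigma_r^(a) sigma_3^(a+1) ... sigma_3^(l), listed as A = 2a-2+r."""
    result = []
    for a in range(1, l + 1):
        for r in (1, 2):
            factors = [I2] * (a - 1) + [SIGMA[r]] + [SIGMA[3]] * (l - a)
            g = [[1]]
            for f in factors:
                g = kron(g, f)
            result.append(g)
    return result


def is_hermitian(M):
    n = len(M)
    return all(M[i][j] == complex(M[j][i]).conjugate()
               for i in range(n) for j in range(n))


def satisfies_dirac_algebra(l):
    """Check {gamma_A, gamma_B} = 2 delta_AB on the 2^l dimensional space."""
    g, dim = gammas(l), 2 ** l
    for A in range(2 * l):
        for B in range(2 * l):
            ab, ba = matmul(g[A], g[B]), matmul(g[B], g[A])
            for i in range(dim):
                for j in range(dim):
                    if ab[i][j] + ba[i][j] != 2 * (A == B) * (i == j):
                        return False
    return True


for l in (1, 2, 3):
    print(l, all(is_hermitian(g) for g in gammas(l)), satisfies_dirac_algebra(l))
```

All entries are small integers or multiples of $i$, so the comparisons are exact and no numerical tolerance is needed.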

All our further calculations will be based on the above specific $2^l$ dimensional
solution for $\gamma_A$ acting on $V$ of dimension $2^l$. In some contexts it is
necessary to pass from the Dirac algebra for one value of $l$ to that for the next
value $l+1$. For this purpose we may note that


$$\begin{aligned}
\gamma_A^{(l+1)} &= \gamma_A^{(l)}\sigma_3^{(l+1)}, \quad A = 1,2,\ldots,2l; \\
\gamma_{2l+1}^{(l+1)} &= \sigma_1^{(l+1)}, \quad \gamma_{2l+2}^{(l+1)} = \sigma_2^{(l+1)}
\end{aligned} \tag{17.9}$$


Here for clarity we have indicated as a superscript on $\gamma$ the relevant “$l$-value”.




17.2 Generators, Weights and Reducibility of
$U(S)$ - the spinor UIR’s of $D_l$


The unitary operators $U(S)$ acting on $V$ are not completely defined by Eq.(17.6).
To the same extent, the corresponding generators $M_{AB}$ are also known only up to
additive constants. It turns out that both $U(S)$ and $M_{AB}$ can be so chosen as
to achieve at most a sign ambiguity in the representation condition in Eq.(17.6),
so we have here a “two-valued” representation of $SO(2l)$.

The generators of $U(S)$ are


$$M_{AB} = \frac{i}{4}[\gamma_A, \gamma_B] \tag{17.10}$$


These are hermitian, reducible (as we shall soon see), and obey the commutation
relations (14.6) as a consequence of the Dirac anticommutation relations (17.3).
In checking this, one finds that the numerical factor $\frac{1}{4}$ is essential. In addition to
the $D_l$ commutation relations, between $M_{AB}$ and $\gamma_C$ one has the useful relation


$$[M_{AB}, \gamma_C] = i(\delta_{BC}\gamma_A - \delta_{AC}\gamma_B) \tag{17.11}$$

This is in fact the infinitesimal form of Eq.(17.6a).
For later purposes we will need the analogue to Eq.(17.9), a relation between
the $M_{AB}$’s generating $SO(2l)$, and the next higher set of $M_{AB}$’s generating
$SO(2(l+1))$. If we once again indicate the $l$-value as a superscript, we have,
using Eq.(17.9) and leaving implicit the unit matrices in relevant subspaces:


$$\begin{aligned}
M_{AB}^{(l+1)} &= M_{AB}^{(l)}, \quad A,B = 1,2,\ldots,2l; \\
M_{A,2l+1}^{(l+1)} &= -\tfrac{1}{2}\gamma_A^{(l)}\sigma_2^{(l+1)}; \\
M_{A,2l+2}^{(l+1)} &= +\tfrac{1}{2}\gamma_A^{(l)}\sigma_1^{(l+1)}; \\
M_{2l+1,2l+2}^{(l+1)} &= -\tfrac{1}{2}\sigma_3^{(l+1)}
\end{aligned} \tag{17.12}$$


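Going back to the $M_{AB}$ for a given fixed $l$, we can identify the Cartan subalgebra
generators $H_a$:


$$H_a = M_{a1,a2} = \frac{i}{2}\gamma_{a1}\gamma_{a2} = \frac{i}{2}\sigma_1^{(a)}\sigma_2^{(a)} = -\frac{1}{2}\sigma_3^{(a)} \tag{17.13}$$

Therefore, in the usual representation of Pauli matrices, the $H_a$ are already
simultaneously diagonal, and we immediately have an orthonormal basis for $V$
given by vectors with definite nondegenerate weights. We use $\epsilon_a = \pm 1$ to label the two
possible eigenvalues $\frac{1}{2}\epsilon_a$ of each $H_a$, so we describe this basis for $V$ in this way:

$$\begin{aligned}
|\{\epsilon\}\rangle &= |\{\epsilon_1, \epsilon_2, \ldots, \epsilon_l\}\rangle, \quad \epsilon_a = \pm 1; \\
H_a|\{\epsilon\}\rangle &= \tfrac{1}{2}\epsilon_a|\{\epsilon\}\rangle; \\
\sigma_3^{(a)}|\{\epsilon\}\rangle &= -\epsilon_a|\{\epsilon\}\rangle
\end{aligned} \tag{17.14}$$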




The weight vector $\mu$ associated with $|\{\epsilon\}\rangle$ is evidently


$$\mu = \tfrac{1}{2}(\epsilon_1 e_1 + \epsilon_2 e_2 + \ldots + \epsilon_l e_l) \tag{17.15}$$


So the complete set of weights occurring in $V$ is

$$\tfrac{1}{2}(\pm e_1 \pm e_2 \pm \ldots \pm e_l)$$


and each being simple, the total number $2^l$ is precisely the dimension of $V$.

Now we come to the question of the reducibility of the unitary representation
$U(S)$ of $D_l$. Symbolically the relation between these finite unitary operators
and the generators is


$$U(S) \sim \exp\left(-\frac{i}{2}\omega_{AB}M_{AB}\right), \quad \omega_{AB} = -\omega_{BA} = \text{real parameters}. \tag{17.16}$$

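The operators $U(S)$ happen to be reducible because there exists a nontrivial
matrix $\gamma_F$ which, by virtue of anticommuting with each $\gamma_A$, commutes with the
generators $M_{AB}$. All this happens because we are dealing with an even number
of dimensions, $2l$. (In analogy with the properties of the matrix $\gamma_5$ in the Dirac
electron theory, we write $F$ as a subscript on $\gamma_F$, to remind us of FIVE.) We
define


$$\gamma_F = i^l\gamma_1\gamma_2\ldots\gamma_{2l} = (-1)^l\sigma_3^{(1)}\sigma_3^{(2)}\ldots\sigma_3^{(l)}, \tag{17.17}$$

and immediately verify all the following basic properties:


$$\begin{aligned}
&\gamma_F^\dagger = \gamma_F^* = \gamma_F^T = \gamma_F^{-1} = \gamma_F; & (a) \\
&\{\gamma_A, \gamma_F\} = 0, \quad [M_{AB}, \gamma_F] = 0, \quad U(S)\gamma_F = \gamma_F U(S); & (b) \\
&\gamma_F|\{\epsilon\}\rangle = \Big(\prod_{a=1}^{l}\epsilon_a\Big)|\{\epsilon\}\rangle & (c)
\end{aligned} \tag{17.18}$$


Since $\gamma_F$ is already diagonal, we call the construction (17.7) a Weyl representation
of the $\gamma$’s.


We can split $V$ into the direct sum of two $2^{l-1}$ dimensional subspaces $V_1$, $V_2$
corresponding to $\gamma_F = \mp 1$:


$$V = V_1 \oplus V_2, \quad \gamma_F = -1 \text{ on } V_1, \ +1 \text{ on } V_2 \tag{17.19}$$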




Clearly, a basis for $V_1$ consists of all $|\{\epsilon\}\rangle$ with an odd number of $\epsilon$’s taking the
value $-1$, while for $V_2$ we use $|\{\epsilon\}\rangle$ with an even number of $\epsilon$’s being $-1$. The
unitary representation $U(S)$ of $D_l$, restricted to $V_1$, then gives us the UIR $\Delta^{(1)}$
with the highest weight being


$$\Lambda^{(l-1)} = \tfrac{1}{2}(e_1 + e_2 + \ldots + e_{l-1} - e_l) \tag{17.20}$$


We see here the effect of having to include at least one minus sign among the
$\epsilon$’s. A spinor (an element of $V_1$) transforming according to this spinor UIR of
$SO(2l)$ is called a left handed spinor. On the other hand, on $V_2$ the $U(S)$ furnish
us with the other spinor UIR $\Delta^{(2)}$ characterised by the highest weight


$$\Lambda^{(l)} = \tfrac{1}{2}(e_1 + e_2 + \ldots + e_l) \tag{17.21}$$


Elements of $V_2$ are then called right handed spinors. We can express this reduction
of the representation $U(S)$ of $D_l$ symbolically as:


$$\begin{aligned}
U(\cdot) &= \text{reducible spinor representation of } D_l \text{ on } V \text{ generated by } M_{AB} \\
&= \text{left handed spinor UIR } \Delta^{(1)} \text{ on } V_1 \\
&\quad \oplus \text{ right handed spinor UIR } \Delta^{(2)} \text{ on } V_2
\end{aligned} \tag{17.22}$$


While the $M_{AB}$ and $U(S)$ are thus reducible, the $\gamma_A$ are of course irreducible
on $V$: in fact since they each anticommute with $\gamma_F$, the only non-zero matrix
elements of $\gamma_A$ are those connecting $V_1$ to $V_2$ and vice versa!
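
The splitting of the $2^l$ spinor weights between $V_1$ and $V_2$ can be enumerated directly. The sketch below is illustrative only: the function name `spinor_weights` is hypothetical, and the "highest" weight is picked out with Python's lexicographic tuple ordering, which is assumed here to agree with the ordering of weights used in the text (first non-vanishing component of the difference positive).

```python
from fractions import Fraction
from itertools import product


def spinor_weights(l):
    """Split the 2^l weights (1/2)(eps_1, ..., eps_l) by the sign of prod(eps_a)."""
    v1, v2 = [], []
    for eps in product((1, -1), repeat=l):
        chirality = 1
        for e in eps:
            chirality *= e              # eigenvalue of gamma_F on |{eps}>
        weight = tuple(Fraction(e, 2) for e in eps)
        (v1 if chirality == -1 else v2).append(weight)
    return v1, v2


v1, v2 = spinor_weights(3)
print(len(v1), len(v2))                 # each subspace has dimension 2^(l-1)
print(max(v1), max(v2))                 # highest weights of the two spinor UIR's
```

For any $l$ the maximum over $V_1$ has exactly one entry $-1/2$, in the last position, while the maximum over $V_2$ has all entries $+1/2$, in line with Eqs.(17.20) and (17.21).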


17.3 Conjugation Properties of Spinor UIR’s of
$D_l$


Each of the representations $U(\cdot)$, $\Delta^{(1)}$ and $\Delta^{(2)}$ is unitary, i.e., self adjoint. We
would like to know how they each behave under complex conjugation. Here the
pattern keeps changing systematically from one $l$-value to the next, which is
why we assembled the Eqs.(17.9),(17.12).

We ask if the reducible representation $U(\cdot)$ is equivalent to its complex
conjugate, i.e., whether there is a matrix $C$ acting on $V$ such that


$$\begin{aligned}
U(S)^* &= CU(S)C^{-1}, \\
\text{i.e.,} \quad (U(S)^T)^{-1} &= CU(S)C^{-1}, \\
\text{i.e.,} \quad -M_{AB}^T &= CM_{AB}C^{-1}
\end{aligned} \tag{17.23}$$

It turns out that such a matrix $C$ does exist, and it can be constructed
recursively from one value of $l$ to the next. Of course since $U(\cdot)$ is reducible
$C$ will not be unique. We will construct one possible $C$ matrix in a convenient
form, see how its properties change with $l$, and also examine its properties with




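respect to $\gamma_F$. This last will then disclose the conjugation behaviours of the
UIR’s $\Delta^{(1)}$ and $\Delta^{(2)}$.

Since the building up of $C$ varies from $l$ to $l+1$, momentarily let us reinstate
the $l$-value as a superscript on all relevant matrices and factors. The desired
behaviour of $M_{AB}^{(l)}$ under conjugation by $C^{(l)}$ suggests that we ask for the $\gamma_A^{(l)}$
to transform as follows:


$$C^{(l)}\gamma_A^{(l)}(C^{(l)})^{-1} = \epsilon_l(\gamma_A^{(l)})^T, \quad \epsilon_l = \pm 1 \tag{17.24}$$


For either value of $\epsilon_l$, we do get $M_{AB}^{(l)}$ transforming correctly. Suppose $C^{(l)}$
has been found; let us see if $C^{(l+1)}$ can be taken as $C^{(l)}$ times a suitable $2 \times 2$
matrix in the space of the additional Pauli triplet $\sigma^{(l+1)}$:


$$C^{(l+1)} = C^{(l)}A^{(l+1)}, \quad A^{(l+1)} = \text{function of } \sigma^{(l+1)} \tag{17.25}$$

The requirement

$$C^{(l+1)}M_{AB}^{(l+1)}(C^{(l+1)})^{-1} = -(M_{AB}^{(l+1)})^T, \quad A,B = 1,2,\ldots,2l+2, \tag{17.26}$$

given Eqs.(17.9),(17.12),(17.24),(17.25), leads to conditions on $A^{(l+1)}$ as


$$\begin{aligned}
&(a) \ A,B \leq 2l: \quad \text{no conditions}; \\
&(b) \ A \leq 2l, B = 2l+1: \quad A^{(l+1)}\sigma_2^{(l+1)} = \epsilon_l\sigma_2^{(l+1)}A^{(l+1)}; \\
&(c) \ A \leq 2l, B = 2l+2: \quad A^{(l+1)}\sigma_1^{(l+1)} = -\epsilon_l\sigma_1^{(l+1)}A^{(l+1)}; \\
&(d) \ A = 2l+1, B = 2l+2: \quad A^{(l+1)}\sigma_3^{(l+1)} = -\sigma_3^{(l+1)}A^{(l+1)}
\end{aligned} \tag{17.27}$$

These conditions fix $A^{(l+1)}$ depending on $\epsilon_l$:

$$\begin{aligned}
\epsilon_l = +1: \quad & A^{(l+1)} = i\sigma_2^{(l+1)} \\
\epsilon_l = -1: \quad & A^{(l+1)} = \sigma_1^{(l+1)}
\end{aligned} \tag{17.28}$$


(we have opted to make $A^{(l+1)}$ real). But then what about $\epsilon_{l+1}$ in

$$C^{(l+1)}\gamma_A^{(l+1)}(C^{(l+1)})^{-1} = \epsilon_{l+1}(\gamma_A^{(l+1)})^T, \quad A = 1,2,\ldots,2l+2? \tag{17.29}$$

Now we discover:


$$\begin{aligned}
&(a) \ A \leq 2l: \quad \epsilon_{l+1} = -\epsilon_l \\
&(b) \ A = 2l+1: \quad A^{(l+1)}\sigma_1^{(l+1)} = \epsilon_{l+1}\sigma_1^{(l+1)}A^{(l+1)} \\
&(c) \ A = 2l+2: \quad A^{(l+1)}\sigma_2^{(l+1)} = -\epsilon_{l+1}\sigma_2^{(l+1)}A^{(l+1)}
\end{aligned} \tag{17.30}$$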



These requirements are consistent with the previous ones, so this recursive
method of constructing $C^{(l)}$ has succeeded. Collecting the results, we have

$$\begin{aligned}
C^{(l)} &= A^{(1)}A^{(2)}\ldots A^{(l)}, \\
A^{(1)} &= i\sigma_2^{(1)}, \ A^{(2)} = \sigma_1^{(2)}, \ A^{(3)} = i\sigma_2^{(3)}, \ \ldots, \\
\epsilon_l &= (-1)^l
\end{aligned} \tag{17.31}$$

We repeat that Eq.(17.23) alone would leave $C^{(l)}$ non-unique since the representation
$U(\cdot)$ is reducible. The additional natural requirement (17.24), and the
recursive method of construction has led to a possible and convenient choice.
It is now an easy matter to check the following additional properties of $C^{(l)}$:

$$\begin{aligned}
(C^{(l)})^T &= (C^{(l)})^{-1} = (-1)^{l(l+1)/2}C^{(l)}, \\
(C^{(l)})^* &= C^{(l)}, \\
C^{(l)}\gamma_F &= (-1)^l\gamma_F C^{(l)}
\end{aligned} \tag{17.32}$$
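
The properties of the conjugation matrix collected in Eqs.(17.24), (17.31) and (17.32) can be checked by brute force for small $l$. The sketch below is illustrative and assumes the alternating choice $A^{(a)} = i\sigma_2^{(a)}$ for odd $a$ and $A^{(a)} = \sigma_1^{(a)}$ for even $a$, together with the Kronecker-product realisation of the $\gamma_A$ from Eq.(17.7); all helper names are hypothetical.

```python
I2 = [[1, 0], [0, 1]]
S1, S2, S3 = [[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]
I_S2 = [[0, 1], [-1, 0]]                # the real matrix i*sigma_2


def kron(A, B):
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def chain(factors):
    """Tensor product of one 2x2 factor per slot 1..l."""
    out = [[1]]
    for f in factors:
        out = kron(out, f)
    return out


def gammas(l):
    return [chain([I2] * (a - 1) + [s] + [S3] * (l - a))
            for a in range(1, l + 1) for s in (S1, S2)]


def conjugation_matrix(l):
    """C^(l) = A^(1) A^(2) ... A^(l), alternating i*sigma_2 and sigma_1."""
    return chain([I_S2 if a % 2 == 1 else S1 for a in range(1, l + 1)])


def check_conjugation(l):
    C = conjugation_matrix(l)
    Ct = transpose(C)
    dim = 2 ** l
    identity = [[int(i == j) for j in range(dim)] for i in range(dim)]
    sym = (-1) ** (l * (l + 1) // 2)
    ok = Ct == [[sym * x for x in row] for row in C]   # C^T = (-1)^{l(l+1)/2} C
    ok = ok and matmul(C, Ct) == identity              # C^{-1} = C^T
    for g in gammas(l):
        lhs = matmul(matmul(C, g), Ct)                 # C gamma_A C^{-1}
        rhs = [[(-1) ** l * x for x in row] for row in transpose(g)]
        ok = ok and lhs == rhs                         # = (-1)^l gamma_A^T
    gF = [[(-1) ** l * x for x in row] for row in chain([S3] * l)]
    rhs = [[(-1) ** l * x for x in row] for row in matmul(gF, C)]
    return ok and matmul(C, gF) == rhs                 # C gamma_F = (-1)^l gamma_F C


print([check_conjugation(l) for l in (1, 2, 3, 4)])
```

Since $C^{(l)}$ is real with entries $0, \pm 1$, every comparison here is exact.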
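The behaviours of $\Delta^{(1)}$ and $\Delta^{(2)}$ under conjugation can now be read off.
For even $l$, each of $\Delta^{(1)}$ and $\Delta^{(2)}$ is self conjugate; whether they are potentially
real or only pseudo-real then depends on the symmetry or antisymmetry of $C$.
For odd $l$, $\Delta^{(1)}$ and $\Delta^{(2)}$ are mutually conjugate. We can thus draw up the
following conjugation table for the spinor UIR’s of $D_l$:


$l$         $\Delta^{(1)}$                             $\Delta^{(2)}$
$4m$        potentially real                           potentially real
$4m+1$      complex, conjugate to $\Delta^{(2)}$       complex, conjugate to $\Delta^{(1)}$
$4m+2$      pseudo-real                                pseudo-real
$4m+3$      complex, conjugate to $\Delta^{(2)}$       complex, conjugate to $\Delta^{(1)}$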


17.4 Remarks on Antisymmetric Tensors
Under $D_l = SO(2l)$


In Section 16.5 we saw that all but two of the antisymmetric tensor representations
of $D_l = SO(2l)$ occur in the list of fundamental UIR’s. However the
antisymmetric tensors of ranks $l - 1$ and $l$ do not appear in such a role. We
devote this section to a brief discussion of such tensors, and a clarification of
the products of spinor UIR’s in which they occur.

An antisymmetric tensor of rank $m$, say, is an $m$-index object $T_{A_1A_2\ldots A_m}$
completely antisymmetric in the indices, which transforms under $S \in SO(2l)$ in




the following manner: 


$$T'_{A_1A_2\ldots A_m} = S_{A_1B_1}S_{A_2B_2}\ldots S_{A_mB_m}T_{B_1B_2\ldots B_m} \tag{17.33}$$


Since for $SO(2l)$ we have an invariant Levi-Civita tensor $\epsilon_{A_1A_2\ldots A_{2l}}$ (normalised
to $\epsilon_{12\ldots 2l} = +1$), we need to consider only antisymmetric tensors of ranks
$1,2,\ldots,l-1,l$. Of these, those of ranks $1,2,\ldots,l-2$ furnish the first $(l-2)$
fundamental UIR’s of the group.

We have already seen in Section 16.5 that the highest weight in the representation
given by antisymmetric tensors of rank $(l-1)$ is


$$\Lambda = e_1 + e_2 + \ldots + e_{l-1} = \Lambda^{(l-1)} + \Lambda^{(l)} \tag{17.34}$$


Thus, this UIR occurs in the reduction of the direct product $\Delta^{(1)} \times \Delta^{(2)}$. Now
let us examine antisymmetric tensors of rank $l$ in some detail. Given such a
tensor $T$, we can view the Levi-Civita symbol as defining a linear operator $\tilde\epsilon$
taking $T$ to $T'$ in the following way:


$$T' = \tilde\epsilon T: \quad T'_{A_1\ldots A_l} = \frac{1}{l!}\epsilon_{A_1\ldots A_lB_1\ldots B_l}T_{B_1\ldots B_l} \tag{17.35}$$

It is easy to check that

$$\tilde\epsilon^2 = (-1)^l \tag{17.36}$$
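
The statement that the duality operator of Eq.(17.35) squares to $(-1)^l$ can be confirmed by direct enumeration for small $l$. The sketch below is illustrative: indices run over $0,\ldots,2l-1$ rather than $1,\ldots,2l$, the sample tensor components $1, 2, 3, \ldots$ are arbitrary, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import factorial


def parity(perm):
    """Sign of a permutation of 0..n-1, found by sorting with transpositions."""
    p, sign = list(perm), 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign


def antisymmetric_tensor(l):
    """Rank-l antisymmetric tensor over 2l dimensions as {index tuple: component}.

    The independent components are given the arbitrary sample values 1, 2, 3, ...
    """
    T = {}
    for value, combo in enumerate(combinations(range(2 * l), l), start=1):
        for perm in permutations(range(l)):
            T[tuple(combo[k] for k in perm)] = parity(perm) * Fraction(value)
    return T


def dual(T, l):
    """(eps T)_{A_1..A_l} = (1/l!) eps_{A_1..A_l B_1..B_l} T_{B_1..B_l}."""
    out = {}
    for perm in permutations(range(2 * l)):
        A, B = perm[:l], perm[l:]
        out[A] = out.get(A, 0) + parity(perm) * T[B]   # eps = parity of (A, B)
    return {A: v / factorial(l) for A, v in out.items()}


for l in (1, 2, 3):
    T = antisymmetric_tensor(l)
    twice = dual(dual(T, l), l)
    print(l, twice == {A: (-1) ** l * v for A, v in T.items()})
```

Exact rational arithmetic is used so that the comparison with $(-1)^l T$ involves no rounding.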


This permits the definition of self-dual and antiself dual antisymmetric tensors
of rank $l$, as eigentensors of $\tilde\epsilon$; while this can be done in the real domain if $l$ is
even, one has to go to the complex domain if $l$ is odd. For uniform appearance
of later results we choose the definitions,


$$\begin{aligned}
\text{Self dual tensors:} \quad & \tilde\epsilon T = i^{l^2}T = (1 \text{ or } i)T, \ \text{according as } l \text{ is even or odd} & (a) \\
\text{Antiself dual tensors:} \quad & \tilde\epsilon T = -i^{l^2}T = (-1 \text{ or } -i)T, \ \text{according as } l \text{ is even or odd} & (b)
\end{aligned} \tag{17.37}$$


The existence of the operator $\tilde\epsilon$ shows that the representation of $SO(2l)$
given by rank $l$ antisymmetric tensors is reducible: it splits into two UIR’s
given respectively by self dual and antiself dual tensors. We wish to find the
weights that occur in each of these UIR’s and in particular their highest weights.

For further algebraic work it is convenient to introduce a formal set of ket
vectors on which $\tilde\epsilon$ and $M_{AB}$ can act, rather than continue to deal with tensors
and their numerical components. Thus we shall have kets $|A_1A_2\ldots A_l\rangle$, with




each $A$ in the range $1,2,\ldots,2l$, subject to the following laws:


$$\begin{aligned}
\text{(i)} \quad & \text{For any permutation } P \in S_l: \\
& |P(A_1A_2\ldots A_l)\rangle = \epsilon_P|A_1A_2\ldots A_l\rangle, \quad \epsilon_P = \text{parity of } P = \pm 1 \\
\text{(ii)} \quad & M_{AB}|A_1A_2\ldots A_l\rangle = i\big(\delta_{BA_1}|AA_2\ldots A_l\rangle - \delta_{AA_1}|BA_2\ldots A_l\rangle \\
& \qquad + \delta_{BA_2}|A_1A\ldots A_l\rangle - \delta_{AA_2}|A_1B\ldots A_l\rangle + \ldots\big) \\
\text{(iii)} \quad & \tilde\epsilon|A_1A_2\ldots A_l\rangle = \frac{1}{l!}\epsilon_{A_1\ldots A_lB_1\ldots B_l}|B_1\ldots B_l\rangle = \eta|A_1A_2\ldots A_l\rangle, \\
& \eta = i^{l^2} = 1 \text{ or } i \text{ acc. as } l \text{ even or odd for self dual case}, \\
& \eta = -i^{l^2} = -1 \text{ or } -i \text{ acc. as } l \text{ even or odd for antiself dual case}
\end{aligned} \tag{17.38}$$


While one can certainly introduce inner products among these kets, we do
not need to do so. The first property above (a “Pauli Principle”) tells us that
in any string $A_1A_2\ldots A_l$, we can always assume that all entries are distinct; thus
each relevant string is a subset of $l$ distinct integers drawn from $1,2,\ldots,2l$. The
signs chosen in the action of $M_{AB}$ are consistent with the $SO(2l)$ commutation
relations (14.6). In the space of the above kets, we want to find simultaneous
eigenkets of $H_1, H_2, \ldots, H_l$ and see what the corresponding weights $\mu = \{\mu_a\}$
look like.

For brevity, let us denote by $\mathcal{A}$ the string of indices $A_1A_2\ldots A_l$ occurring
inside each basic ket. In a given $\mathcal{A}$, a certain pair $2a-1, 2a$ out of $1,2,\ldots,2l$
may not occur at all; or one member of this pair may be present but not the
other; or finally both members of the pair may be present. Use of Eq.(17.38)(ii)
shows that in the first and last cases, we obtain the eigenvalue zero for $H_a$:


$$\begin{aligned}
2a-1, 2a \notin \mathcal{A}: \quad & H_a|\mathcal{A}\rangle = 0; \\
2a-1, 2a \in \mathcal{A}: \quad & H_a|\mathcal{A}\rangle = 0.
\end{aligned} \tag{17.39}$$


In fact, the only possible eigenvalues of each $H_a$ (in the space of tensors
we are considering here) are $0, \pm 1$. Examples of eigenvectors of, say, $H_1 = M_{12}$
with eigenvalues $\pm 1$ are easily built:


$$\begin{aligned}
&|1A_2\ldots A_l\rangle \pm i|2A_2\ldots A_l\rangle \equiv |1 \pm i2, A_2\ldots A_l\rangle, \quad 1,2 \notin A_2\ldots A_l, \\
&H_1|1 \mp i2, A_2\ldots A_l\rangle = \pm|1 \mp i2, A_2\ldots A_l\rangle
\end{aligned} \tag{17.40}$$

Here we have introduced a compact way of expressing certain linear combinations
of the basic kets $|\mathcal{A}\rangle$, that will be convenient in what follows. In any
weight $\mu$ which occurs, then, each $\mu_a = \pm 1$ or $0$.




We must now see in a general way how to connect up a vector $|\mu\ldots\rangle$, a
simultaneous eigenket of the $H_a$, with the $|\mathcal{A}\rangle$. For a while we will work only
with Eq.(17.38)(i), (ii), and only later impose the $\tilde\epsilon$-related condition. In a basis
ket $|\mathcal{A}\rangle$, suppose the string $\mathcal{A}$ is characterised by $l_0$ completely absent pairs, $l_1$
pairs contributing only one member each, and $l_2$ pairs fully present. We easily
see that


$$\begin{aligned}
l_0 + l_1 + l_2 &= \text{total number of pairs} = l, \\
l_1 + 2l_2 &= \text{number of entries in } \mathcal{A} = l, \\
\text{i.e.,} \quad l_0 &= l_2
\end{aligned} \tag{17.41}$$


On the other hand, suppose in a particular weight vector $\mu = (\mu_1\ldots\mu_l)$,
the entries $0, 1, -1$ occur $n_0$, $n_1$ and $n_{-1}$ times respectively. Of course $\mu$ may
be degenerate. In the expression of an eigenket $|\mu\ldots\rangle$ as a linear combination
of the $|\mathcal{A}\rangle$, it must be clear because of Eqs.(17.39),(17.40) that the following
relations between $n_0$, $n_{\pm 1}$ on the one hand, and $l_0$, $l_1$, $l_2$ on the other must be
maintained in each term:


$$\begin{aligned}
n_0 &= l_0 + l_2 = 2l_0 = \text{even}, \\
n_1 + n_{-1} &= l_1
\end{aligned} \tag{17.42}$$


One more restriction on allowed weights $\mu$ emerges: the entry zero must
occur an even number of times! One can now see why in general a weight $\mu$ has
multiplicity greater than one, and also when $\mu$ will be simple. For each $H_a$ for
which $\mu_a = \pm 1$, the disposition of the indices $2a-1, 2a$ in the corresponding
pair is completely fixed by Eq.(17.40), and there is no freedom left. But for the
even number of $H_a$’s for which $\mu_a = 0$, for each such $H_a$ we have the possibility
of either completely dropping or fully including its pair $2a-1, 2a$. In fact,
for half of these $H_a$’s we must pick the former alternative and for the other
half the latter alternative, since from the connecting relations (17.42) we have
$l_0 = l_2 = \frac{1}{2}n_0$! Thinking this through, one realises that $\mu$ can have nontrivial multiplicity
only if $n_0 > 0$; and conversely if $n_0 = 0$, i.e., each $\mu_a = \pm 1$, then $\mu$ is simple.

As an illustrative example, consider the weight vector $\mu$ to be
$(1,1,\ldots,1,-1,-1,\ldots,-1,0,\ldots)$:


$$\begin{aligned}
\mu_1 = \mu_2 = \ldots = \mu_{n_1} &= 1, \\
\mu_{n_1+1} = \mu_{n_1+2} = \ldots = \mu_{n_1+n_{-1}} &= -1, \\
\mu_a &= 0 \ \text{for } a = n_1+n_{-1}+1, \ldots, l
\end{aligned} \tag{17.43}$$


Then the following linear combination of the basic kets $|\mathcal{A}\rangle$ is a possible
eigenket $|\mu\ldots\rangle$:


$$\begin{aligned}
|\mu\ldots\rangle \sim |&1-i2,\ 3-i4,\ \ldots,\ 2n_1-1-i\,2n_1;\ 2n_1+1+i(2n_1+2),\ \ldots, \\
&2n_1+2n_{-1}-1+i(2n_1+2n_{-1}),\ \text{some pairs}\rangle.
\end{aligned} \tag{17.44}$$




By the words “some pairs” we mean some choice of half of the last $n_0$ pairs
in $1,2,\ldots,2l-1,2l$. We see immediately why, if $n_0 = 0$, $\mu$ is simple: this last
choice does not have to be made, and the ket $|\mu\rangle$ is fully determined.

Now let us bring in $\tilde\epsilon$ and ask for its effect on the ket appearing in Eq.(17.44).
Going back to Eq.(17.40) for a moment, we see that

$$\tilde\epsilon|1 \mp i2, A_2\ldots A_l\rangle = \frac{\pm i}{(l-1)!}\epsilon_{1A_2\ldots A_l2A'_2\ldots A'_l}|1 \mp i2, A'_2\ldots A'_l\rangle \tag{17.45}$$

it being understood that neither 1 nor 2 occur in $A_2\ldots A_l$. We can extend this
to the ket appearing in Eq.(17.44); if for simplicity we omit some of the entries
in this ket, we can write:


$$\begin{aligned}
\tilde\epsilon|1-i2, 3-i4, \ldots, \text{some pairs}\rangle = \ & i^{n_1}(-i)^{n_{-1}} \\
&\times \epsilon_{1,3,\ldots,2n_1+2n_{-1}-1,\,2,4,\ldots,2n_1+2n_{-1},\,\text{pairs}} \\
&\times |1-i2, 3-i4, \ldots, \text{pairs absent on left}\rangle, \ \text{no sums on pairs}
\end{aligned} \tag{17.46}$$


It helps here to recognize that each pair of indices $2a-1, 2a$ behaves like a
“boson” and can be freely moved past other indices with no minus signs being
produced! After bringing the $\epsilon$-component here to normal form, we can rewrite
the above equation as


$$\begin{aligned}
\tilde\epsilon|1-i2; 3-i4; \ldots; \text{some pairs}\rangle = \ & i^{n_1}(-i)^{n_{-1}}(-1)^{\frac{1}{2}(n_1+n_{-1}-1)(n_1+n_{-1})} \\
&\times |1-i2, 3-i4, \ldots, \text{pairs absent on left}\rangle
\end{aligned} \tag{17.47}$$


We are now essentially done. If every ket obeys Eq.(17.38)(iii) and so is
an eigenket of $\tilde\epsilon$ with eigenvalue $\eta$, then on the possible simple weights $\mu = \{\mu_a\}$
in which each $\mu_a = \pm 1$ there is the restriction


$$n_0 = 0: \ \mu \text{ simple}: \quad \eta = i^{n_1}(-i)^{n_{-1}}(-1)^{l(l-1)/2}, \quad \text{i.e.,} \quad \eta(-1)^{n_{-1}} = i^{l^2} \tag{17.48}$$


We will interpret this in a moment, but we also notice from Eq.(17.47) that if a
tensor has a definite duality property, then any allowed weight $\mu$ with $n_0 = 2$,
i.e., two of the $\mu_a$ being zero, is also a simple weight; nontrivial multiplicities
show up only if $n_0 = 4,6,\ldots$!
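
The step from Eq.(17.47) to the compact form of the restriction in Eq.(17.48) is a short exercise in powers of $i$. With $n_0 = 0$ one has $n_1 + n_{-1} = l$, and the phase can be simplified as follows (a worked sketch of the intermediate algebra, which the text leaves implicit):

```latex
\begin{aligned}
i^{n_1}(-i)^{n_{-1}} &= i^{\,l-n_{-1}}(-i)^{n_{-1}} = i^{\,l}(-1)^{n_{-1}}, \\
(-1)^{l(l-1)/2} &= i^{\,l(l-1)}, \\
\eta &= i^{\,l}\, i^{\,l(l-1)} (-1)^{n_{-1}} = i^{\,l^2}(-1)^{n_{-1}}
\;\Longrightarrow\; \eta\,(-1)^{n_{-1}} = i^{\,l^2}.
\end{aligned}
```

So for a simple weight the eigenvalue $\eta$ depends on the weight only through the parity of $n_{-1}$, the number of entries equal to $-1$.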

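Now we use Eq.(17.48) to answer the question: for a definite kind of duality
property, what is the allowed highest weight $\Lambda$? We see from a combination of
Eqs.(17.38),(17.48) that the pattern is as follows:


$$\begin{aligned}
\text{Self dual case:} \quad & \eta = i^{l^2}: \ n_{-1} = 0, \ \Lambda = e_1 + e_2 + \ldots + e_{l-1} + e_l; \\
\text{Antiself dual case:} \quad & \eta = -i^{l^2}: \ n_{-1} = 1, \ \Lambda = e_1 + e_2 + \ldots + e_{l-1} - e_l
\end{aligned} \tag{17.49}$$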




(We see now the justification for the definitions adopted in Eqs.(17.37)!) We
have uniformly the result that the direct product $\Delta^{(1)} \times \Delta^{(1)}$ produces, as
the leading piece in its reduction, the UIR of rank $l$ antisymmetric antiself
dual tensors; while the direct product $\Delta^{(2)} \times \Delta^{(2)}$ produces the UIR of rank
$l$ antisymmetric self dual tensors. A summary of these and other significant
properties of $SO(2l)$ antisymmetric tensors is given in the table:


Antisymmetric tensors of $SO(2l)$:


Rank                   Highest weight                          Status                                                  Remark
1                      $e_1$                                   Fundamental UIR $D^{(1)}$                               Defining UIR $D$
2                      $e_1 + e_2$                             Fundamental UIR $D^{(2)}$                               Adjoint UIR
3                      $e_1 + e_2 + e_3$                       Fundamental UIR $D^{(3)}$
...
$l-2$                  $e_1 + e_2 + \ldots + e_{l-2}$          Fundamental UIR $D^{(l-2)}$
$l-1$                  $e_1 + e_2 + \ldots + e_{l-1}$          Leading piece in $\Delta^{(1)} \times \Delta^{(2)}$     UIR
$l$, self-dual         $e_1 + e_2 + \ldots + e_{l-1} + e_l$    Leading piece in $\Delta^{(2)} \times \Delta^{(2)}$     UIR
$l$, antiself-dual     $e_1 + e_2 + \ldots + e_{l-1} - e_l$    Leading piece in $\Delta^{(1)} \times \Delta^{(1)}$     UIR


17.5 The Spinor UIR’s of $B_l = SO(2l+1)$


The situation here is somewhat simpler than in the case of $D_l$. As we have seen
earlier, there is just one spinor UIR, with highest weight given in Eq.(17.2). Actually,
in comparison with the $D_l$ situation, things are slightly more complicated
at the Dirac algebra level, and slightly simpler at the spinor UIR level!

We begin by asking for an irreducible, hermitian solution to the “$(2l+1)$
dimensional” Dirac algebra:


$$\{\gamma_A, \gamma_B\} = 2\delta_{AB}, \quad A,B = 1,2,\ldots,2l+1 \tag{17.50}$$

It turns out that now there are two inequivalent solutions, each of dimension
$2^l$:


(i) We take $\gamma_A$, $A = 1,2,\ldots,2l$ as constructed earlier in the case of $D_l$, and
to them we adjoin


$$\gamma_{2l+1} = \gamma_F \tag{17.51}$$


All of Eq.(17.50) are obeyed, and furthermore the following algebraic relation
holds:


$$\gamma_1\gamma_2\ldots\gamma_{2l}\gamma_{2l+1} = (-i)^l \tag{17.52}$$

(ii) Take $\gamma'_A = \gamma_A$ for $A = 1,2,\ldots,2l$ as before, but choose now


$$\gamma'_{2l+1} = -\gamma_F \tag{17.53}$$




The entire set 7, = (y1,-y2,--+» Yat) YF) again is a solution to (17.50); 
however in place of (17.52) we have a different algebraic relation, 


Ye Veter = —(-2)! (17.54) 


which is why no unitary transformation can connect the set ya to the set is. 
One does however have the connection 


Ya=—oryayp, A=1,2,...,2+41 (17.55) 


which we will use shortly. We shall generally work with the representation (i) 
above. 
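The two solutions can be checked numerically for small $l$. The following is a minimal sketch, not the construction of Eq.(17.7): it assumes a standard Kronecker-product (Jordan-Wigner type) set of $2l$ hermitian anticommuting matrices and assumes the normalisation $\gamma_F = i^l \gamma_1 \gamma_2 \cdots \gamma_{2l}$, so that the product of all $2l+1$ matrices is $(-i)^l$ for set (i) and $-(-i)^l$ for set (ii). All function names are hypothetical.

```python
from functools import reduce

# Pauli matrices and the 2x2 identity, as nested lists of complex numbers.
I2 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]


def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scaled_identity(c, n):
    """c times the n x n unit matrix."""
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]


def is_close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def even_gammas(l):
    """2l anticommuting hermitian matrices of size 2^l (assumed construction)."""
    gammas = []
    for r in range(l):
        for sigma in (S1, S2):
            factors = [S3] * r + [sigma] + [I2] * (l - r - 1)
            gammas.append(reduce(kron, factors))
    return gammas


def gamma_f(l):
    """Assumed normalisation: gamma_F = i^l gamma_1 ... gamma_{2l}."""
    product = reduce(matmul, even_gammas(l))
    return [[(1j ** l) * x for x in row] for row in product]


def obeys_dirac_algebra(gammas):
    """Check {gamma_A, gamma_B} = 2 delta_AB for every pair, Eq.(17.50)."""
    n = len(gammas[0])
    for a, ga in enumerate(gammas):
        for b, gb in enumerate(gammas):
            ab, ba = matmul(ga, gb), matmul(gb, ga)
            anti = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]
            if not is_close(anti, scaled_identity(2 if a == b else 0, n)):
                return False
    return True


def check_two_solutions(l):
    """True if sets (i) and (ii) obey (17.50) and satisfy (17.52), (17.54)."""
    n = 2 ** l
    gf = gamma_f(l)
    set_i = even_gammas(l) + [gf]
    set_ii = even_gammas(l) + [[[-x for x in row] for row in gf]]
    target = (-1j) ** l
    return (obeys_dirac_algebra(set_i) and obeys_dirac_algebra(set_ii)
            and is_close(reduce(matmul, set_i), scaled_identity(target, n))
            and is_close(reduce(matmul, set_ii), scaled_identity(-target, n)))
```

Since the product of all the matrices is a multiple of the unit matrix, it is unchanged by any similarity transformation; the opposite signs found for the two sets are therefore a direct check that the sets are inequivalent.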


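If $S = (S_{AB}) \in SO(2l+1)$, by continuity and preservation of the algebra
(17.50) we see that $S_{BA}\gamma_B$ must be unitarily related to $\gamma_A$, not to $\gamma'_A$:

$$S \in SO(2l+1): \quad S_{BA}\gamma_B = U(S)\gamma_A U(S)^{-1},$$
$$U(S')U(S) = (\text{phase})\, U(S'S). \qquad (17.56)$$

Similarly, we have

$$S \in SO(2l+1): \quad S_{BA}\gamma'_B = U'(S)\gamma'_A (U'(S))^{-1},$$
$$U'(S')U'(S) = (\text{phase})\, U'(S'S). \qquad (17.57)$$

The generators $M_{AB}$ for $U(S)$ and $M'_{AB}$ for $U'(S)$ are unitarily related,

$$M_{AB} = \frac{i}{4}[\gamma_A, \gamma_B],$$
$$M'_{AB} = \frac{i}{4}[\gamma'_A, \gamma'_B] = \gamma_F M_{AB} \gamma_F^{-1},$$
$$U'(S) = \gamma_F U(S) \gamma_F^{-1} \qquad (17.58)$$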


It is here that we use Eq.(17.55). Thus, while we do have two inequivalent
representations of the Dirac algebra in odd dimensions, they lead to equivalent
generators for $SO(2l+1)$, and so to essentially just one spinor UIR. In fact the
irreducibility of $U(S)$ follows from that of $M_{AB}$, which in turn is a consequence
of our being able to express each $\gamma_A$ as a product of $l$ of the $M$'s. One can
depict the situation pictorially thus:

[Diagram: Dirac algebra in $(2l+1)$ dimensions -> two inequivalent irreducible representations -> equivalent generators -> unique spinor UIR $\Delta$ of $SO(2l+1)$]



Hereafter let us stick to $\gamma_A$, $M_{AB}$ and $U(S)$. The space of this UIR is $\mathcal{V}$ of
dimension $2^l$, with the basis $|\{\epsilon\}\rangle$ set up in Eq.(17.14). The Cartan subalgebra
for $SO(2l+1)$ is the same as for $SO(2l)$, and the highest weight is as expected,
$\Lambda = (1/2, 1/2, \ldots, 1/2)$. As for the behaviour of $U(S)$ under conjugation, it is
necessarily self conjugate, and the same $C$ matrix will do as previously; from
Eqs.(17.18a), (17.24), (17.31), (17.51) we have uniformly:

$$C\gamma_A C^{-1} = (-1)^l \gamma_A^T, \qquad A = 1, 2, \ldots, 2l+1;$$
$$C M_{AB} C^{-1} = -M_{AB}^T;$$
$$(U(S)^T)^{-1} = C\,U(S)\,C^{-1} \qquad (17.59)$$


Since $C$ is sometimes symmetric and sometimes antisymmetric, as determined
by Eq.(17.32), $U(S)$ is correspondingly potentially real or pseudoreal:

$$l = 4m \text{ or } 4m+3: \quad \Delta \text{ of } SO(2l+1) \text{ is potentially real};$$
$$l = 4m+1 \text{ or } 4m+2: \quad \Delta \text{ of } SO(2l+1) \text{ is pseudo-real} \qquad (17.60)$$
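The rule in Eq.(17.60) depends only on $l$ modulo 4, which makes it easy to tabulate. The short sketch below simply encodes that rule together with the dimension $2^l$; the function names are illustrative choices and not notation from the text.

```python
def spinor_reality(l):
    """Reality type of the spinor UIR of B_l = SO(2l+1), following Eq.(17.60)."""
    if l < 1:
        raise ValueError("l must be a positive integer")
    return "potentially real" if l % 4 in (0, 3) else "pseudo-real"


def spinor_dimension(l):
    """Dimension 2^l of the spinor UIR of B_l."""
    return 2 ** l


# Tabulate the first few odd-dimensional rotation groups.
for l in range(1, 9):
    print(f"SO({2 * l + 1}): dim {spinor_dimension(l)}, {spinor_reality(l)}")
```

For $l = 1$ this reproduces the familiar statement that the two-dimensional spinor UIR of $SO(3)$ is pseudo-real.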


Combining these results with those of Section 17.3, we see that as a whole
the properties of the spinor UIR's of the orthogonal groups exhibit a “cycle of
eight” structure.


$SO(8m+1) = B_{4m}$        Real, dim. $2^{4m}$
$SO(8m+2) = D_{4m+1}$      Mutually conjugate, dim. $2^{4m}$
$SO(8m+3) = B_{4m+1}$      Pseudo-real, dim. $2^{4m+1}$
$SO(8m+4) = D_{4m+2}$      Pseudo-real, dim. $2^{4m+1}$
$SO(8m+5) = B_{4m+2}$      Pseudo-real, dim. $2^{4m+2}$
$SO(8m+6) = D_{4m+3}$      Mutually conjugate, dim. $2^{4m+2}$
$SO(8m+7) = B_{4m+3}$      Real, dim. $2^{4m+3}$


This table, and the behaviour of dimensionalities, tempts us to conclude this
section with the following remarks. Take the spinor UIR $\Delta$ of $B_l = SO(2l+1)$, of
dimension $2^l$. Each of the sets of matrices $\gamma_A$ and $M_{AB}$, for $A, B = 1, 2, \ldots, 2l+1$,
is separately irreducible. Evidently, $M_{AB}$ and $\gamma_A$ form a Lie algebra; and so
do $M_{AB}$ and $-\gamma_A$. These just give the two inequivalent spinor UIR's $\Delta^{(1)}$, $\Delta^{(2)}$
of $D_{l+1} = SO(2l+2)$, both of dimension $2^l$, in some order. In fact,

$$M_{A,2l+2} = \mp\frac{1}{2}\gamma_A, \quad A = 1, 2, \ldots, 2l+1 \;\Rightarrow$$
$$\Lambda = \frac{1}{2}(e_1 + e_2 + \cdots + e_l \mp e_{l+1}) \;\Rightarrow$$
$$\Delta^{(1)} \text{ or } \Delta^{(2)} \text{ of } D_{l+1} \qquad (17.61)$$


17.6 Antisymmetric Tensors under $B_l = SO(2l+1)$


Here again the situation is much simpler than with $D_l$. There is no duality
operation on antisymmetric tensors of a given rank. The presence of the Levi-Civita
tensor $\epsilon_{A_1 A_2 \cdots A_{2l+1}}$ allows us to limit ourselves to antisymmetric tensors of
ranks $1, 2, \ldots, l-1, l$. All but the last are bases for fundamental UIR's. As noted
in Section 16.5, the $l$th rank antisymmetric tensor UIR has as its highest weight
twice the highest weight of the spinor UIR $\Delta$. All the pertinent information
about $B_l$ antisymmetric tensors can thus be summarised in this fashion:


Antisymmetric tensors of $SO(2l+1)$:


Rank     Highest weight                      Status                                            Remark
1        $e_1$                               Fundamental UIR $D^{(1)}$                         Defining UIR $D$
2        $e_1 + e_2$                         Fundamental UIR $D^{(2)}$                         Adjoint UIR
3        $e_1 + e_2 + e_3$                   Fundamental UIR $D^{(3)}$                         -
...
$l-1$    $e_1 + e_2 + \cdots + e_{l-1}$      Fundamental UIR $D^{(l-1)}$                       -
$l$      $e_1 + e_2 + \cdots + e_l$          Leading piece in $\Delta \times \Delta$ UIR       -


Chapter 18 


Spinor Representations for 
Real Pseudo Orthogonal 
Groups 


The pseudo orthogonal group $SO(p, q)$ is the connected proper group of real linear
transformations in a $(p+q)$-dimensional “space-time” preserving a nondegenerate
but indefinite metric with signature $(+ + \cdots + - - \cdots -)$, there being $p$ plus
and $q$ minus signs. While its Lie algebra is simple (provided $(p, q) \neq (2, 2)$!),
it is a noncompact Lie group, and all its nontrivial unitary representations are
infinite dimensional. For many physical applications, however, one is interested
in the spinor representations of these groups, and furthermore in various kinds
of spinors. These representations are finite dimensional and non-unitary. We
can use all the algebraic results we have put together in Chapter 17 to provide
a discussion of spinors for $SO(p, q)$. Naturally the behaviours of spinors
depend very much on whether the total dimensionality $p + q$ is even or odd,
and whether the number of minus signs in the metric is even or odd. We give a
concise account of these matters in this Chapter.


18.1 Definition of $SO(p, q)$ and Notational Matters


We shall use Greek indices $\mu, \nu, \ldots$ to go over the full range $1, 2, \ldots, p+q$;
early Latin indices $j, k, \ldots$ to go over the positive metric “spatial” dimensions
$1, 2, \ldots, p$; and late Latin indices $r, s, \ldots$ to go over the negative metric “time-like”
dimensions $p+1, p+2, \ldots, p+q$. The metric tensor $\eta_{\mu\nu}$ is diagonal with

$$\eta_{11} = \eta_{22} = \cdots = \eta_{pp} = +1,$$
$$\eta_{p+1,p+1} = \cdots = \eta_{p+q,p+q} = -1 \qquad (18.1)$$




This metric will also be used, along with its inverse $\eta^{\mu\nu}$, for lowering and
raising Greek indices.
The group $SO(p, q)$ consists of all real $(p+q)$-dimensional matrices $\Lambda = (\Lambda^{\mu}{}_{\nu})$
obeying the conditions

$$\Lambda^{\mu}{}_{\nu}\Lambda_{\mu\lambda} \equiv \eta_{\mu\rho}\Lambda^{\mu}{}_{\nu}\Lambda^{\rho}{}_{\lambda} = \eta_{\nu\lambda},$$
$$\det\Lambda = +1, \qquad (18.2)$$

and in addition the condition that it lie in the component continuously connected
to the identity. (Often this last condition is indicated by denoting the group in
a more specific way than by just $SO(p, q)$, but we shall leave it implicit). After
removal of the conventional “quantum mechanical $i$”, the Lie algebra of $SO(p, q)$
is spanned by $M_{\mu\nu} = -M_{\nu\mu}$ obeying the commutation relations


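$$[M_{\mu\nu}, M_{\rho\sigma}] = i(\eta_{\nu\rho}M_{\mu\sigma} - \eta_{\mu\rho}M_{\nu\sigma} + \eta_{\nu\sigma}M_{\rho\mu} - \eta_{\mu\sigma}M_{\rho\nu}) \qquad (18.3)$$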


In a nontrivial unitary representation of $SO(p, q)$, which is necessarily infinite
dimensional, the $M_{\mu\nu}$ would be hermitian, or more precisely self adjoint
operators. In finite dimensional non-unitary matrix representations, however,
we can assume without loss of generality that

$$M_{jk}^\dagger = M_{jk} = \text{hermitian } SO(p) \text{ generators},$$
$$M_{rs}^\dagger = M_{rs} = \text{hermitian } SO(q) \text{ generators},$$
$$M_{jr}^\dagger = -M_{jr} = \text{antihermitian generators} \qquad (18.4)$$

These properties are of course consistent with Eqs.(18.3).
In working out the spinor representations of $D_l$ and $B_l$ we constructed
hermitian irreducible matrices $\gamma_A$ in Eqs.(17.7), (17.17). We shall now write
$\gamma_A^{(0)}$ for them:

$$\gamma_A^{(0)} = \text{previous } \gamma_A, \quad A = 1, 2, \ldots, 2l;$$
$$\gamma_{2l+1}^{(0)} = \gamma_F. \qquad (18.5)$$


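18.2 Spinor Representations $S(\Lambda)$ of $SO(p, q)$ for $p + q = 2l$


The appropriate Dirac algebra is

$$\{\gamma_\mu, \gamma_\nu\} = 2\eta_{\mu\nu}, \qquad \mu, \nu = 1, 2, \ldots, p+q = 2l \qquad (18.6)$$

Up to equivalence, there is a unique irreducible representation, on a space
$\mathcal{V}$ of dimension $2^l$. We take the solution

$$\gamma_j = \gamma_j^{(0)}, \quad j = 1, 2, \ldots, p;$$
$$\gamma_r = i\gamma_r^{(0)}, \quad r = p+1, \ldots, 2l \qquad (18.7)$$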




Thus the hermiticity relations can be expressed as

$$\gamma_\mu^\dagger = \eta_{\mu\mu}\gamma_\mu, \quad \text{no sum} \qquad (18.8)$$

With this hermiticity property specified, the solution to (18.6) is unique up to
unitary equivalence.
For any $\Lambda \in SO(p, q)$,

$$\gamma'_\mu = \Lambda^{\nu}{}_{\mu}\gamma_\nu \qquad (18.9)$$

is a solution to (18.6), if $\gamma_\mu$ is. (However in general $\gamma'_\mu$ will not enjoy the
hermiticity properties (18.8), unless $\Lambda \in SO(p) \times SO(q)$). Therefore there must
be a similarity transformation connecting $\gamma'_\mu$ and $\gamma_\mu$ as

$$\Lambda^{\nu}{}_{\mu}\gamma_\nu = S(\Lambda)\gamma_\mu S(\Lambda)^{-1},$$
$$S(\Lambda')S(\Lambda) = \pm S(\Lambda'\Lambda). \qquad (18.10)$$

This is the origin of the $2^l$-dimensional (reducible) “Dirac spinor” representation
$S(\Lambda)$ of $SO(p, q)$. It is in fact a double-valued representation, or more precisely,
a representation of the universal covering group of $SO(p, q)$, which in some cases
is more than a double covering of $SO(p, q)$.

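The infinitesimal generators of $S(\Lambda)$, such that

$$S(\Lambda) \simeq \exp\left(-\frac{i}{2}\omega^{\mu\nu}M_{\mu\nu}\right), \qquad \omega_{\mu\nu} = -\omega_{\nu\mu} = \text{real}, \qquad (18.11)$$

are

$$M_{\mu\nu} = \frac{i}{4}[\gamma_\mu, \gamma_\nu]. \qquad (18.12)$$

One easily checks the validity of both Eqs.(18.3), (18.4), and also of

$$[M_{\mu\nu}, \gamma_\rho] = i(\eta_{\nu\rho}\gamma_\mu - \eta_{\mu\rho}\gamma_\nu) \qquad (18.13)$$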
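These checks can also be done numerically. The sketch below is illustrative only: it assumes a Kronecker-product set of hermitian matrices in place of the $\gamma_A^{(0)}$ of Eq.(17.7), multiplies the last $q$ of them by $i$ as in Eq.(18.7), forms $M_{\mu\nu} = \frac{i}{4}[\gamma_\mu, \gamma_\nu]$, and then tests the hermiticity pattern (18.4) and the commutator $[M_{\mu\nu}, \gamma_\rho] = i(\eta_{\nu\rho}\gamma_\mu - \eta_{\mu\rho}\gamma_\nu)$ that follows from this definition and the Dirac algebra (18.6). Indices run from 0 in the code, and all helper names are hypothetical.

```python
from functools import reduce

# Pauli matrices and the 2x2 identity, as nested lists of complex numbers.
I2 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]


def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lincomb(c1, a, c2, b):
    """Entrywise combination c1*a + c2*b."""
    return [[c1 * x + c2 * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def dagger(a):
    """Hermitian conjugate."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def is_close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def gammas_pq(p, q):
    """gamma_mu as in Eq.(18.7), from an assumed hermitian set gamma^(0)."""
    if (p + q) % 2:
        raise ValueError("this sketch covers p + q = 2l only")
    l = (p + q) // 2
    g0 = [reduce(kron, [S3] * r + [s] + [I2] * (l - r - 1))
          for r in range(l) for s in (S1, S2)]
    # The first p matrices stay hermitian; the last q are multiplied by i.
    return [g if mu < p else lincomb(1j, g, 0, g) for mu, g in enumerate(g0)]


def eta(p, mu, nu):
    """Diagonal metric of Eq.(18.1), with 0-based indices."""
    if mu != nu:
        return 0
    return 1 if mu < p else -1


def generator(g, mu, nu):
    """M_{mu nu} = (i/4)[gamma_mu, gamma_nu], Eq.(18.12)."""
    return lincomb(0.25j, matmul(g[mu], g[nu]), -0.25j, matmul(g[nu], g[mu]))


def check_generators(p, q):
    """Test the hermiticity pattern (18.4) and the commutator with gamma_rho."""
    g = gammas_pq(p, q)
    indices = range(p + q)
    for mu in indices:
        for nu in indices:
            m = generator(g, mu, nu)
            # Hermitian inside the SO(p) and SO(q) blocks, antihermitian otherwise.
            sign = 1 if (mu < p) == (nu < p) else -1
            if not is_close(dagger(m), lincomb(sign, m, 0, m)):
                return False
            for rho in indices:
                lhs = lincomb(1, matmul(m, g[rho]), -1, matmul(g[rho], m))
                rhs = lincomb(1j * eta(p, nu, rho), g[mu],
                              -1j * eta(p, mu, rho), g[nu])
                if not is_close(lhs, rhs):
                    return False
    return True
```

For instance, `check_generators(3, 1)` runs the test for the Lorentz group $SO(3, 1)$ with $4 \times 4$ matrices.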


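The reduction of $S(\Lambda)$ is achieved with the same $\gamma_F$ constructed in Section
17.2 when discussing $D_l$ spinors. We find:

$$\{\gamma_\mu^{(0)}, \gamma_F\} = 0 \;\Rightarrow\; \{\gamma_\mu, \gamma_F\} = 0 \;\Rightarrow$$
$$[M_{\mu\nu}, \gamma_F] = 0 \;\Rightarrow$$
$$S(\Lambda)\gamma_F = \gamma_F S(\Lambda). \qquad (18.14)$$

Denote the eigenspaces of $\gamma_F$ for eigenvalues $\pm 1$ as $\mathcal{V}_\pm$: these are the $\mathcal{V}_2$ and
$\mathcal{V}_1$ of Section 17.2, Eq.(17.19). Then $S(\Lambda)$ restricted to these $2^{l-1}$-dimensional
subspaces gives us two irreducible spinor representations $S_\pm(\Lambda)$ of $SO(p, q)$:

$$\mathcal{V} = \mathcal{V}_+ \oplus \mathcal{V}_-, \qquad S(\Lambda) = \begin{pmatrix} S_+(\Lambda) & 0 \\ 0 & S_-(\Lambda) \end{pmatrix} \qquad (18.15)$$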




18.3 Representations Related to $S(\Lambda)$


One can pass from the Dirac spinor representation of $SO(p, q)$ to its adjoint, contragredient
or conjugate, as defined in Section 8.5. The representation matrices,
and their generators, behave as follows:

$$S(\Lambda),\ M_{\mu\nu} \;\to\; \text{adjoint}: \quad (S(\Lambda)^\dagger)^{-1},\ M_{\mu\nu}^\dagger;$$
$$\to\; \text{contragredient}: \quad (S(\Lambda)^T)^{-1},\ -M_{\mu\nu}^T;$$
$$\to\; \text{conjugate}: \quad S(\Lambda)^*,\ -M_{\mu\nu}^* \qquad (18.16)$$

To relate each of these to $S(\Lambda)$, we need matrices relating $\gamma_\mu$ to $\gamma_\mu^\dagger$, $\gamma_\mu^T$ and
$\gamma_\mu^*$. These are the generalisations of the familiar $A, B, C$ matrices of the Dirac
equation.


Define the matrix $A$ by

$$A = \gamma_{p+1}\gamma_{p+2}\cdots\gamma_{p+q} = i^q \gamma_{p+1}^{(0)}\gamma_{p+2}^{(0)}\cdots\gamma_{p+q}^{(0)} \qquad (18.17)$$

It is the product of all the “time-like” gammas. Such a matrix was not needed
in the $D_l$ analysis. It obeys

$$\gamma_\mu^\dagger = (-1)^q A\gamma_\mu A^{-1} \qquad (18.18)$$

Since the $\gamma_\mu$ are irreducible, such an $A$ is unique up to a factor, and we choose the
specific one in Eq.(18.17). To pass to the transpose, we can use $C$ constructed
in Section 17.3; it obeys Eq.(17.24) which we repeat here as

$$\gamma_\mu^T = (-1)^l C\gamma_\mu C^{-1} \qquad (18.19)$$

Again, the irreducibility of $\gamma_\mu$ means such a $C$ is unique up to a factor. Combining
Eqs.(18.18), (18.19) with the unitarity of $C$ we get the behaviour under
complex conjugation:

$$\gamma_\mu^* = (-1)^{l+q} CA\gamma_\mu (CA)^{-1} \qquad (18.20)$$
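The important algebraic properties of $A$ and $C$ are:

$$C^\dagger = C^{-1}, \qquad C^T = (-1)^{l(l+1)/2}\, C,$$
$$A^\dagger = A^{-1} = (-1)^{q(q+1)/2} A, \qquad (a)$$
$$A^T = (-1)^{lq + q(q-1)/2}\, C A C^{-1},$$
$$A^* = (-1)^{lq + q}\, C A C^{-1}. \qquad (b) \qquad (18.21)$$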
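As a worked illustration, the properties of $A$ can be obtained from the Dirac algebra alone. The sketch below takes $A = \gamma_{p+1}\gamma_{p+2}\cdots\gamma_{p+q}$ and uses only three inputs: each time-like $\gamma_r$ is antihermitian with $\gamma_r^2 = -1$; reversing the order of $q$ mutually anticommuting factors costs a sign $(-1)^{q(q-1)/2}$; and $\gamma_\mu^T = (-1)^l C\gamma_\mu C^{-1}$.

```latex
\begin{align*}
A^\dagger &= \gamma_{p+q}^\dagger \cdots \gamma_{p+1}^\dagger
           = (-1)^q\, \gamma_{p+q} \cdots \gamma_{p+1}
           = (-1)^{q + q(q-1)/2} A = (-1)^{q(q+1)/2} A, \\
A^2 &= (-1)^{q(q-1)/2}\, (\gamma_{p+1} \cdots \gamma_{p+q})(\gamma_{p+q} \cdots \gamma_{p+1})
     = (-1)^{q(q-1)/2} (-1)^q = (-1)^{q(q+1)/2}
     \;\Rightarrow\; A^{-1} = (-1)^{q(q+1)/2} A = A^\dagger, \\
A^T &= \gamma_{p+q}^T \cdots \gamma_{p+1}^T
     = (-1)^{lq}\, C\, \gamma_{p+q} \cdots \gamma_{p+1}\, C^{-1}
     = (-1)^{lq + q(q-1)/2}\, C A C^{-1}, \\
A^* &= (A^\dagger)^T = (-1)^{q(q+1)/2} A^T
     = (-1)^{lq + q^2}\, C A C^{-1} = (-1)^{lq + q}\, C A C^{-1}.
\end{align*}
```

The last step uses the fact that $q^2$ and $q$ have the same parity.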


Thanks to the relations (18.18), (18.19) and (18.20), we see that the Dirac
spinor representation $S(\Lambda)$ goes into itself under each of the three operations
(18.16): it is self adjoint, self contragredient and self conjugate. These are
achieved thus:

$$M_{\mu\nu}^\dagger = A M_{\mu\nu} A^{-1}: \quad (S(\Lambda)^\dagger)^{-1} = A\,S(\Lambda)A^{-1};$$
$$-M_{\mu\nu}^T = C M_{\mu\nu} C^{-1}: \quad (S(\Lambda)^T)^{-1} = C\,S(\Lambda)C^{-1};$$
$$-M_{\mu\nu}^* = CA\,M_{\mu\nu}(CA)^{-1}: \quad S(\Lambda)^* = CA\,S(\Lambda)(CA)^{-1} \qquad (18.22)$$

In these relations, since $S(\Lambda)$ is reducible, we could have replaced $A$ and $C$ by
$A f(\gamma_F)$ and $C g(\gamma_F)$, where $f(\gamma_F)$ and $g(\gamma_F)$ are nonsingular.


18.4 Behaviour of the Irreducible Spinor Representations $S_\pm(\Lambda)$


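The consequences of Eqs.(18.22) for the irreducible representations $S_\pm(\Lambda)$ contained
in $S(\Lambda)$ depend on the behaviours of $A$ and $C$ with respect to $\gamma_F$, and
so ultimately on the parities of $l$ and $q$. We find that

$$C\gamma_F = (-1)^l \gamma_F C,$$
$$A\gamma_F = (-1)^q \gamma_F A,$$
$$CA\gamma_F = (-1)^{l+q} \gamma_F CA \qquad (18.23)$$

It is useful to recall that if we have a Dirac spinor $\psi$ belonging to the
representation $S(\Lambda)$, then

$$\psi' = S(\Lambda)\psi \;\Rightarrow$$
$$\psi'^\dagger = \psi^\dagger A\,S(\Lambda)^{-1}A^{-1} \;\Rightarrow$$
$$\bar\psi' = \bar\psi\,S(\Lambda)^{-1}, \quad \bar\psi \equiv \psi^\dagger A \;\Rightarrow$$
$$\bar\psi\psi = \text{invariant} \qquad (18.24)$$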


On putting together the reduction (18.15) of $S(\Lambda)$ and the properties (18.23),
we get the following pattern of results:


$l$     $q$     Equivalent to $S_+$                              Equivalent to $S_-$                              Remark on $\bar\psi\psi$
even    even    $(S_+^\dagger)^{-1}, (S_+^T)^{-1}, S_+^*$        $(S_-^\dagger)^{-1}, (S_-^T)^{-1}, S_-^*$        No mixing of chiralities
even    odd     $(S_+^T)^{-1}, (S_-^\dagger)^{-1}, S_-^*$        $(S_-^T)^{-1}, (S_+^\dagger)^{-1}, S_+^*$        Mixes chiralities
odd     even    $(S_+^\dagger)^{-1}, (S_-^T)^{-1}, S_-^*$        $(S_-^\dagger)^{-1}, (S_+^T)^{-1}, S_+^*$        No mixing of chiralities
odd     odd     $S_+^*, (S_-^\dagger)^{-1}, (S_-^T)^{-1}$        $S_-^*, (S_+^\dagger)^{-1}, (S_+^T)^{-1}$        Mixes chiralities


One can see that the “Dirac mass term” $\bar\psi\psi$ mixes chiralities if $q$, the
number of time like dimensions, is odd, and not if it is even. The above list of
properties is relevant in discussing Weyl and Majorana spinors.
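The statement about the mass term can be made explicit with chirality projectors. The following is a sketch of the argument; the projectors $P_\pm$ and the components $\psi_\pm$ are notation introduced here for illustration, and $\gamma_F$ is taken to be hermitian with $\gamma_F^2 = 1$.

```latex
\begin{align*}
P_\pm &= \tfrac{1}{2}(1 \pm \gamma_F), \qquad \psi_\pm = P_\pm \psi, \qquad
\bar\psi\psi = \psi^\dagger A \psi
  = \sum_{a, b = \pm} \psi_a^\dagger A\, \psi_b , \\
A \gamma_F &= (-1)^q \gamma_F A \;\Rightarrow\;
P_\pm A = A P_\pm \ (q \text{ even}), \qquad P_\pm A = A P_\mp \ (q \text{ odd}), \\
q \text{ even}: &\quad \psi_\pm^\dagger A\, \psi_\mp = \psi^\dagger A P_\pm P_\mp \psi = 0
  \;\Rightarrow\; \bar\psi\psi = \psi_+^\dagger A \psi_+ + \psi_-^\dagger A \psi_- , \\
q \text{ odd}: &\quad \psi_\pm^\dagger A\, \psi_\pm = \psi^\dagger A P_\mp P_\pm \psi = 0
  \;\Rightarrow\; \bar\psi\psi = \psi_+^\dagger A \psi_- + \psi_-^\dagger A \psi_+ .
\end{align*}
```

So for odd $q$ only cross terms between the two chiralities survive, while for even $q$ each chirality contributes separately.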




18.5 Spinor Representations of $SO(p, q)$ for $p + q = 2l + 1$


The Dirac algebra reads

$$\{\gamma_\mu, \gamma_\nu\} = 2\eta_{\mu\nu}, \qquad \mu, \nu = 1, 2, \ldots, 2l, 2l+1 \qquad (18.25)$$

On the space $\mathcal{V}$ of dimension $2^l$, we have two irreducible inequivalent representations
possible:

(i) $\gamma_j = \gamma_j^{(0)}$; $\gamma_r = i\gamma_r^{(0)}$, $r = p+1, \ldots, p+q-1 = 2l$; $\gamma_{2l+1} = i\gamma_F$
(ii) $\gamma_j = \gamma_j^{(0)}$; $\gamma_r = i\gamma_r^{(0)}$, $r = p+1, \ldots, p+q-1 = 2l$; $\gamma_{2l+1} = -i\gamma_F$ (18.26)

But, as it happened in the case of $B_l$, they will lead to equivalent spinor
representations and generators, so we stick to representation (i) above in the
sequel.


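The Dirac spinor representation and its generators are now irreducible on
$\mathcal{V}$:

$$\Lambda \in SO(p, q): \quad \gamma'_\mu = \Lambda^{\nu}{}_{\mu}\gamma_\nu = S(\Lambda)\gamma_\mu S(\Lambda)^{-1},$$
$$S(\Lambda')S(\Lambda) = \pm S(\Lambda'\Lambda),$$
$$S(\Lambda) \simeq \exp\left(-\frac{i}{2}\omega^{\mu\nu}M_{\mu\nu}\right),$$
$$M_{\mu\nu} = \frac{i}{4}[\gamma_\mu, \gamma_\nu] \qquad (18.27)$$

The hermiticity and commutation relations (18.3), (18.4) are again satisfied.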

Irreducibility of $S(\Lambda)$ immediately tells us it must be again self adjoint, self
contragredient and self conjugate. For these purposes, the same $C$ matrix can
be used as previously, but the $A$ matrix is different:

$$A = \gamma_{p+1}\gamma_{p+2}\cdots\gamma_{p+q}$$
$$= i^q \gamma_{p+1}^{(0)}\cdots\gamma_{p+q-1}^{(0)}\gamma_F$$
$$= i \times (A\text{-matrix for } SO(p, q-1)) \times \gamma_F \qquad (18.28)$$

One has again the adjoint and other properties

$$\gamma_\mu^\dagger = (-1)^q A\gamma_\mu A^{-1},$$
$$\gamma_\mu^T = (-1)^l C\gamma_\mu C^{-1},$$
$$\gamma_\mu^* = (-1)^{l+q} CA\gamma_\mu (CA)^{-1}, \qquad \mu = 1, 2, \ldots, 2l+1 \qquad (18.29)$$

Moreover, even with the new definition of $A$, all of Eqs.(18.21), (18.22) continue
to be valid with no changes at all, so we do not repeat them. The only difference
is that since $M_{\mu\nu}$ and $S(\Lambda)$ are now irreducible, there is now no freedom to
attach non-singular functions of $\gamma_F$ to $A$ and $C$ in Eq.(18.22).




18.6 Dirac, Weyl and Majorana Spinors for $SO(p, q)$


For both $p + q = 2l$ = even, and $p + q = 2l + 1$ = odd, we work in the same
space $\mathcal{V}$ of dimension $2^l$. In the even case, $S(\Lambda)$ reduces to the irreducible parts
$S_\pm(\Lambda)$ (corresponding to $\gamma_F = \pm 1$); in the odd case, it is irreducible.

For any $p + q$: a Dirac spinor is any element $\psi$ in the linear space $\mathcal{V}$, subject
to the transformations $S(\Lambda)$ as in Eq.(18.24). Weyl and Majorana spinors are
Dirac spinors obeying additional conditions.


Weyl Spinors 


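These are defined only when $p + q = 2l$. A Weyl spinor is a Dirac spinor $\psi$
obeying the Weyl condition:

$$\gamma_F\psi = \epsilon_W\psi, \qquad \epsilon_W = \pm 1. \qquad (18.30)$$

Depending on $\epsilon_W$, we get righthanded (positive chirality) or lefthanded (negative
chirality) spinors:

$$\epsilon_W = +1: \quad \psi \in \mathcal{V}_+, \quad \psi' = S_+(\Lambda)\psi: \quad \text{right handed}$$
$$\epsilon_W = -1: \quad \psi \in \mathcal{V}_-, \quad \psi' = S_-(\Lambda)\psi: \quad \text{left handed} \qquad (18.31)$$

If $p + q = 2l + 1$, we do not define Weyl spinors at all.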


Majorana Spinors 


These can be considered for any $p + q$, and are Dirac spinors obeying a reality
condition. We know that for any $p + q$, we have

$$S(\Lambda)^* = CA\,S(\Lambda)(CA)^{-1} \qquad (18.32)$$

so for any Dirac spinor $\psi$,

$$\psi' = S(\Lambda)\psi \;\Rightarrow\; (CA)^{-1}\psi'^* = S(\Lambda)(CA)^{-1}\psi^* \qquad (18.33)$$

That is, a Dirac $\psi$ and $(CA)^{-1}\psi^*$ transform in the same way. We have a
Majorana spinor if these two are essentially the same. Three possible situations
can arise, which we look at in sequence. First we gather information regarding
two important phases $\xi$, $\eta$ connected with $CA$. For any $p, q$ we have under
transposition:

$$(CA)^T = \xi\,CA,$$
$$\xi = (-1)^{\frac{1}{2}l(l+1) + \frac{1}{2}q(q-1) + lq} \qquad (18.34)$$




Here we have used Eq.(18.21(b)), and have left the dependence of $\xi$ on $l$ and
$q$ implicit. [Remember also that the definition of $A$ depends on whether $p + q$
is even or odd, but that since Eqs.(18.21) are uniformly valid, so is Eq.(18.34)
above]. On the other hand, all the matrices $C$, $A$, $CA$ are unitary, so we also
have

$$(CA)^*(CA) = \xi \qquad (18.35)$$

With respect to $\gamma_F$, we have the property (18.23) relevant only if $p + q = 2l$ =
even, and written now as

$$CA\gamma_F = \eta\,\gamma_F CA, \qquad \eta = (-1)^{l+q} \qquad (18.36)$$

So, the phase $\xi$ is defined for all $p$ and $q$, whether $p + q = 2l$ or $2l + 1$; while the
phase $\eta$ is defined only when $p + q = 2l$.

The three situations are as follows: 

(i) Suppose $p + q = 2l + 1$, so $S(\Lambda)$ is irreducible. A Majorana spinor is a
Dirac spinor which for some complex number $\alpha$ obeys

$$(CA)^{-1}\psi^* = \alpha\psi,$$
$$\text{i.e.,} \quad \psi^* = \alpha\,CA\psi \qquad (18.37)$$

Taking the complex conjugate of this condition, using it over again, and then
Eq.(18.35), one finds the restriction

$$|\alpha|^2\xi = 1 \qquad (18.38)$$

Thus, $\alpha$ must be a pure phase, which can be absorbed into $\psi$. A Majorana
spinor can therefore exist in an odd number of dimensions if and only if the
phase $\xi = 1$. The sufficiency is shown by the following argument: if $\xi = 1$, then
the matrix $CA$ is both unitary and symmetric, so by the argument of section 10
it possesses a complete orthonormal set of real eigenvectors. Thus nonvanishing
$\psi$ obeying the Majorana condition (18.37) do exist.
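The restriction on $\alpha$ can be spelled out in a few lines. The sketch below uses only the Majorana condition $\psi^* = \alpha\,CA\psi$ and the relation $(CA)^*(CA) = \xi$; the angle $\theta$ is an auxiliary symbol introduced here.

```latex
\begin{align*}
\psi^* = \alpha\, CA\, \psi
  &\;\Rightarrow\; \psi = \alpha^* (CA)^* \psi^*
   = \alpha^* \alpha\, (CA)^* (CA)\, \psi = |\alpha|^2 \xi\, \psi , \\
\psi \neq 0
  &\;\Rightarrow\; |\alpha|^2 \xi = 1
  \;\Rightarrow\; \xi = +1, \quad |\alpha| = 1 , \\
\alpha = e^{2 i \theta}, \quad \tilde\psi = e^{i\theta} \psi
  &\;\Rightarrow\; \tilde\psi^* = e^{-i\theta} \alpha\, CA\, \psi
   = e^{-2 i \theta} \alpha\, CA\, \tilde\psi = CA\, \tilde\psi .
\end{align*}
```

The second line uses $|\alpha|^2 > 0$ and $\xi = \pm 1$; the third line shows how the phase of $\alpha$ is absorbed into the spinor.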

(ii) Suppose $p + q = 2l$, and we ask if there are spinors having both Weyl and
Majorana properties. Now $S(\Lambda)$ is reducible, so we must contend with the fact
that $\psi^*$ transforms in the same way as $f(\gamma_F)CA\psi$ for any function $f(\cdot)$. Let us
ask for the necessary and sufficient conditions for existence of Majorana-Weyl
spinors. The Weyl condition says

$$\gamma_F\psi = \pm\psi,$$
$$\text{i.e.,} \quad \psi = \begin{pmatrix}\varphi \\ 0\end{pmatrix} \text{ or } \begin{pmatrix}0 \\ \chi\end{pmatrix} \qquad (18.39)$$

Given that $\psi$ is of one of these two forms, the Majorana condition says

$$\psi^* = \alpha\,CA\psi \qquad (18.40)$$

where there is no need to include the factor $f(\gamma_F)$. Unless $CA$ is block diagonal,
$\psi$ would vanish, so a necessary condition is $\eta = 1$. Given this, it follows that
$\xi = 1$ is also a necessary condition. But these are also sufficient! For, if $\xi = \eta = 1$,
we have

$$CA = \begin{pmatrix} K_1 & 0 \\ 0 & K_2 \end{pmatrix} \qquad (18.41)$$

with both $K_1$ and $K_2$ being unitary and symmetric. Then each of $K_1$ and $K_2$
has a complete orthonormal set of real eigenvectors, and we can have Majorana-Weyl
spinors with $\epsilon_W = +1$ or $-1$.
Thus, $\xi = \eta = 1$ is the necessary and sufficient condition for Majorana-Weyl
spinors to exist. In this case, plain Majorana spinors not having the Weyl
property are just obtained by putting together Majorana-Weyl spinors with
both $\epsilon_W$ values, and the general Majorana condition is

$$\psi^* = f(\gamma_F)CA\psi \qquad (18.42)$$

(iii) Suppose $p + q = 2l$, but Majorana-Weyl spinors do not exist. Then
either $\xi = -1, \eta = 1$ or $\xi = 1, \eta = -1$ or $\xi = \eta = -1$. In each case let us see if
Majorana spinors exist.


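If $\xi = -1, \eta = 1$, and we write $CA$ in the block diagonal form (18.41), then
both $K_1$ and $K_2$ are unitary antisymmetric:

$$K_1^* K_1 = K_2^* K_2 = -1 \qquad (18.43)$$

The most general Majorana condition on $\psi$ is

$$\psi^* = f(\gamma_F)CA\psi,$$
$$\text{i.e.,} \quad \begin{pmatrix}\varphi^* \\ \chi^*\end{pmatrix} = \begin{pmatrix}\alpha & 0 \\ 0 & \beta\end{pmatrix}\begin{pmatrix}K_1 & 0 \\ 0 & K_2\end{pmatrix}\begin{pmatrix}\varphi \\ \chi\end{pmatrix} \qquad (18.44)$$

where $\alpha, \beta$ are complex numbers. But the fact that $\xi = -1$ forces both $\varphi$ and
$\chi$ to vanish, so there are no Majorana spinors at all.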


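If $\eta = -1, \xi = \pm 1$, then $CA$ is block-off-diagonal:

$$CA = \begin{pmatrix}0 & K \\ \xi K^T & 0\end{pmatrix}, \qquad K^\dagger K = 1 \qquad (18.45)$$

The general Majorana condition on $\psi$ is

$$\psi^* = f(\gamma_F)CA\psi,$$
$$\text{i.e.,} \quad \begin{pmatrix}\varphi^* \\ \chi^*\end{pmatrix} = \begin{pmatrix}\alpha & 0 \\ 0 & \beta\end{pmatrix}\begin{pmatrix}0 & K \\ \xi K^T & 0\end{pmatrix}\begin{pmatrix}\varphi \\ \chi\end{pmatrix},$$
$$\text{i.e.,} \quad \varphi^* = \alpha K\chi, \quad \chi^* = \beta\xi K^T\varphi,$$
$$\text{i.e.,} \quad \varphi = \alpha^* K^* \beta\xi K^T\varphi,$$
$$\text{i.e.,} \quad \alpha^*\beta = \xi \qquad (18.46)$$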




Thus, both $\alpha$ and $\beta$ must be non-zero, and similarly both $\varphi$ and $\chi$ must be
non-zero, and such Majorana spinors definitely exist.

Collecting all the above results, we have the following picture telling us
when each kind of spinor can be found:


                     Weyl                            Majorana                                          Majorana-Weyl
$p + q = 2l + 1$     $\times$                        $\xi = 1$                                         $\times$
$p + q = 2l$         Always, $\epsilon_W = \pm 1$    $\xi = \eta = 1$; or $\xi = \pm 1$, $\eta = -1$   $\xi = \eta = 1$


All these results are based on group representation structures alone. We
have also assumed that each $\psi$ is a “vector” in $\mathcal{V}$ with complex numbers for
its components. If they are Grassmann variables, or if we ask for the Weyl
or Majorana or combined property to be maintained in the course of time as
controlled by some field equations, other new conditions may arise.

As an elementary application let us ask in the case of the groups $SO(p, 1)$
when we can have Majorana-Weyl spinors. Now $q = 1$ and so $p$ must necessarily
be odd so that $p + 1 = 2l$ can be even. The necessary and sufficient conditions
are

$$\xi = (-1)^{\frac{1}{2}l(l+1) + l} = 1,$$
$$\eta = (-1)^{l+1} = 1$$

This limits $l$ to the values $l = 5, 9, 13, \ldots$ (omitting the somewhat trivial
case $l = 1$), so we have Majorana-Weyl spinors only for the groups $SO(9, 1)$,
$SO(17, 1)$, $SO(25, 1), \ldots$ in the family of Lorentz groups $SO(p, 1)$.
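The existence criteria collected above are simple enough to tabulate by machine. The following sketch encodes the phases $\xi = (-1)^{\frac{1}{2}l(l+1) + \frac{1}{2}q(q-1) + lq}$ and $\eta = (-1)^{l+q}$ and applies the summary table; the function names and the dictionary layout are illustrative choices, not notation from the text.

```python
def xi(l, q):
    """Phase in (CA)^T = xi * CA, Eq.(18.34)."""
    exponent = l * (l + 1) // 2 + q * (q - 1) // 2 + l * q
    return -1 if exponent % 2 else 1


def eta_phase(l, q):
    """Phase in CA gamma_F = eta gamma_F CA, Eq.(18.36); used only for p + q = 2l."""
    return -1 if (l + q) % 2 else 1


def spinor_types(p, q):
    """Which of Weyl, Majorana, Majorana-Weyl spinors exist for SO(p, q)."""
    n = p + q
    l = n // 2
    if n % 2 == 1:
        # Odd total dimension: no Weyl spinors; Majorana spinors need xi = 1.
        return {"Weyl": False, "Majorana": xi(l, q) == 1, "Majorana-Weyl": False}
    x, e = xi(l, q), eta_phase(l, q)
    return {
        "Weyl": True,
        # Majorana spinors exist for xi = eta = 1, or for eta = -1 with either xi.
        "Majorana": (x == 1 and e == 1) or e == -1,
        "Majorana-Weyl": x == 1 and e == 1,
    }


# Lorentz groups SO(p, 1) admitting Majorana-Weyl spinors, for p up to 26.
lorentz_mw = [p for p in range(1, 27) if spinor_types(p, 1)["Majorana-Weyl"]]
print(lorentz_mw)  # [1, 9, 17, 25]
```

The list reproduces $SO(9, 1)$, $SO(17, 1)$, $SO(25, 1)$ together with the trivial case $SO(1, 1)$; for $SO(3, 1)$ the same function reports Weyl and Majorana spinors but no Majorana-Weyl spinors.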


Exercises for Chapters 17 and 18 


1. Show that an alternative representation for the gamma matrices of Eq.(17.7) 
is 
Yer = oy aa og Vale). 


2. For the construction of the spinor representations of $D_l = SO(2l)$ in Sections
17.1, 17.2, reconcile the irreducibility of the $\gamma_A$ with the reducibility
of the generators $M_{AB}$.


3. For the group $SO(6)$ which is locally isomorphic to $SU(4)$, trace the connection
of the two spinor UIR's $\Delta^{(1)}$ and $\Delta^{(2)}$ to the defining representation
of $SU(4)$ and its complex conjugate.


4. For the case $l = 3$, $D_3 = SO(6)$, supply proofs of the results stated in the
table at the end of Section 17.4.


5. Verify all the stated relations in Eqs.(18.18), (18.19), (18.20), (18.21),
(18.22). Similarly for Eqs.(18.29).


6. Check that the 4 component spinor $\psi$ in the familiar Dirac wave equation
is the direct sum of two irreducible two-component spinors of $SO(3, 1)$
transforming as complex conjugates of one another.


Bibliography 


[1] R. Brauer and H. Weyl: ‘Spinors in n dimensions’, Am. J. Math. 57, 425- 
449 (1935). 


[2] G. Racah: ‘Group Theory and Spectroscopy’, Princeton lectures, 1951, 
reprinted in Ergebn. der. Exakten Naturwiss, 37, 28-84 (1965). 


[3] W. Pauli: ‘Continuous Groups in Quantum Mechanics,’ CERN Lectures, 
1956; reprinted in Ergebn. der. Exakten Naturwiss. 37, 85-104 (1965). 


[4] L. Pontrjagin: Topological Groups, Princeton University Press, 1958. 


[5] R.E. Behrends; J. Dreitlein, C. Fronsdal and B.W. Lee, “Simple Groups 
and Strong Interaction Symmetries”, Revs. Mod. Phys., 34, 1-40, 1962. 


[6] M. Hamermesh: Group Theory and its Applications to Physical Problems, 
Addison Wesley, 1962. 


[7] A. Salam: “The Formalism of Lie Groups”, Trieste Lectures, 1963. 


[8] H. Boerner: Representations of Groups with special consideration for the 
needs of Modern Physics, North Holland, 1970. 


[9] E.C.G. Sudarshan and N. Mukunda: Classical Dynamics — A Modern 
Perspective, Wiley, New York, 1974. 


[10] R. Gilmore: Lie Groups, Lie Algebras and Some of their Applications, 
Wiley, New York, 1974. 


[11] B.G. Wybourne: Classical Groups for Physicists, Wiley, New York, 1974. 


{12] R. Slansky: ‘Group Theory for Unified Model Building’, Phys. Rep., 79C, 
1, 1981. 


[13] P. Langacker: ‘Grand Unified Theories and Proton Decay’, Phys. Rep., 
72C, 185, 1981. 


[14] J. Wess: ‘Introduction to Gauge Theory’, Les Houches Lectures, 1981. 


[15] H. Georgi: Lie Algebras in Particle Physics, Benjamin/Cummings, 1982. 


274 Bibliography 


[16] F. Wilczek and A. Zee: ‘Families from Spinors’, Phys. Rev., D25, 553, 
1982. 


[17] P. van Nieuwenhuizen: “Simple supergravity and the Kaluza-Klein Pro- 
gram,” Les Houches Lectures, 1983. 


{18] A.P. Balachandran and C.G. Trahern: Lectures on Group Theory for Physi- 
cists, Bibliopolis, Napoli, 1984. 


(19] A.O. Barut and R. Raczka: “Theory of Group Representations and Appli- 
cations,” World Scientific, Singapore, 1986. 





This book presents a survey of Topology and Differential Geometry and also Lie Groups and
Algebras, and their Representations. The first topic
is indispensable to students of gravitation and related
areas of modern physics (including string theory), while
the second has applications in gauge theory and particle
physics, integrable systems and nuclear physics.


Part I provides a simple introduction to basic topology,
followed by a survey of homotopy. Calculus of
differentiable manifolds is then developed, and a
Riemannian metric is introduced along with the key
concepts of connections and curvature. The final
chapters lay out the basic notions of simplicial homology
and de Rham cohomology as well as fibre bundles,
particularly tangent and cotangent bundles.


Part II starts with a review of group theory, followed
by the basics of representation theory. A thorough
description of Lie groups and algebras is presented with
their structure constants and linear representations.
Root systems and their classifications are detailed, and
this section of the book concludes with the description
of representations of simple Lie algebras, emphasizing
spinor representations of orthogonal and pseudo-orthogonal
groups.


The style of presentation is succinct and precise.
Involved mathematical proofs that are not of primary
importance to physics students are omitted. The book
aims to provide the reader access to a wide variety of
sources in the current literature, in addition to being
a textbook of advanced mathematical methods for
physicists.

